PREFACE

The TOPS-10 Crash Analysis Guide is a procedural and reference manual 
that you can use to diagnose the causes of TOPS-10 system failures and 
to correct these problems.

The TOPS-10 Software Notebook Set contains several documents that you 
should use while analyzing system crashes. In particular, you will 
find the TOPS-10 Monitor Tables Descriptions and the Stopcodes 
Specification are most important for symbol definitions, and the 
TOPS-10 DDT Manual is a useful reference for the debugging tools used 
in the procedures.

Before you can reliably diagnose and repair system problems, you must 
be able to use DDT commands to examine and patch the TOPS-10 monitor 
modules. You must also be familiar with any local modifications that 
have been made to the monitor.

This manual uses a few symbols to indicate special characters. They
are:

Character   Meaning

^\          <Control-backslash> is the character to type on
            the CTY to get the attention of the parser.

$           The ESCape character, or altmode, is used in
            commands to DDT and TECO.

<CTRL/Z>    This control character is used to terminate a
            TOPS-10 process, such as DDT. It is displayed as
            ^Z.





CHAPTER 1 
INTRODUCTION 



Crash analysis is used in the process of solving system problems. You 
can analyze a crash by examining a copy of memory that is stored in a 
crash file when the operating system stops running. There are 
different methods of analyzing different types of system problems. It 
may be helpful, for example, to isolate the cause of a problem as
either the hardware or the software in a preliminary investigation,
but it is important to understand and recognize all symptoms of system
problems, including those involving the interaction of both hardware
and software.

This manual describes methods that you can apply to various system 
problems. As you become more familiar with the monitor and the tools 
you use to debug the system, you will be able to customize these 
methods to your own needs.



1.1 SYSTEM ERROR RECOVERY 

To successfully analyze different types of system problems, you should 
try to view the system as a whole, investigating hardware status and 
software conditions, as well as the interaction of the two. You can 
use many informational tools to detect and correct system problems:
hardware diagnostics verify the hardware state of the machine, and
software test packages verify the performance and validity of software
components. The monitor itself is an excellent test program for both
hardware and software. It prints and saves information about the
problems it encounters on the console terminal (CTY). Each CPU in a
multiple-CPU configuration has a CTY, where it prints information
about the stopcodes it encounters, messages for the operator, and a
log of system events.

The TOPS-10 monitor and hardware systems are designed to prevent the 
system from crashing when a minor error is encountered. Timesharing
is interrupted only by an unrecoverable, or fatal, error. Most system
problems are not fatal, and in most cases system operation continues 
normally. 

A hardware or software error that prevents normal timesharing 
operation causes a crash; that is, the system performs certain error 
recovery operations, terminates all user and system jobs, and restarts 
operation with a fresh database. If a hardware or software error is 
serious enough to warrant this procedure, the system is halted and a 
copy of memory is written to disk (or dumped) before the system is 
reloaded. This copy of memory, called the crash file, is useful 
because the system uses this file to record the contents of many 
registers and data structures. This manual describes how to examine 
the crash file to find information that might indicate the reason for 
the crash. 


Not all hardware and software errors cause the system to crash. The 
software is equipped with a number of special error recovery 
procedures to continue operation after a system or user error. The 
software generates a stopcode, which provides the system manager with
information about the cause of the error, and lists system modules and
data locations useful in analyzing the source of the stopcode. This
information is printed on the system's CTY to inform the operator of
the status of the system. A continuable stopcode does not cause a 
system reload or halt, but, in most cases, produces a crash file. 

A system error that causes a crash, like a program error that causes a 
halt, is called a fatal error, because all the jobs on the system must 
be halted and restarted. The system records as much information as 
possible before the crash. However, in the act of reloading memory or 
processing a hardware error, the operating system may lose or 
overwrite applicable data locations, and a certain amount of 
information may be lost. In every crash, it is important to be aware
that information recorded during the crash may be invalid or 
corrupted. 

The way the monitor processes the error depends on the type of failure 
that occurred. The method you use to analyze the crash depends on the 
type of information that the monitor saved before the crash. This 
manual is organized to provide crash analysis information for 
different types of crashes. Remember that this manual can only 
explain ideal and general situations. As the system analyst, you 
should be familiar with the specific aspects of the system you are 
analyzing, because you may face unique problems at your site. If 
possible, review the system build procedure, especially the 
information about hardware and software configuration. This type of 
information is described in the TOPS-10 Software Installation Guide.

DIGITAL provides software error reporting and revision services for 
problems you cannot solve. If you cannot solve a problem that 
prevents system operation, submit a Software Performance Report (SPR) 
through your DIGITAL Service Representative. Be sure to include all 
the information required to analyze a system crash. This manual 
describes that information. 



1.2 TYPES OF ERRORS 

The hardware and software handle each type of system problem 
differently. Most problems do not result in a crash; many errors are 
handled locally for a specific program or device, without affecting 
the entire system. For example, TOPS-10 is designed so that 
unprivileged user jobs cannot directly crash the system. If a user 
program develops a fatal error, the monitor aborts the program without 
affecting the other users on the system. If the monitor data base 
entries for a particular user job are destroyed, the monitor tries to 
eliminate the job without affecting other jobs. However, changes to 
system-wide variables such as those affecting memory and CPU usage may 
cause the system to crash. 

In almost all cases, the software detects and handles errors by 
gathering information and taking corrective action. In the case of a 
fatal error, the system reloads automatically. Fault continuation 
allows the system to correct certain types of errors and continue 
operation without affecting the execution of user programs. In most 
cases, corrective action affects only the process at fault. Such 
action might include repeating an I/O operation or stopping execution
of a user job. 

Fault continuation allows the system and user jobs to continue with
little or no interruption, but continuable stopcodes are recorded on 
the CTY for later examination. It is important to be aware of all 
previous errors in the process of analyzing a crash, even those that 
did not directly cause the system to crash. Internal discrepancies 
that corrupt an important data structure may in turn affect other 
routines, and the error propagates, or the software goes into an 
infinite loop. 

Crash files and CTY listings are the main sources of information about 
the system before the time of the crash. However, error recovery code 
can contain errors of its own. The history of a crash, including data 
from the time leading up to the crash, is an important source of 
information in these situations. 

When the system crashes, you must be prepared to verify that the 
system actually crashed, and determine the extent to which the 
software was affected. You must isolate the problem that caused the 
error by defining the point in the code where the error was detected, 
then identify the problem that caused the error condition, record that 
information, and correct the problem if possible. 

This procedure, and the tools you will need to analyze crashes, are 
described in the following chapters. Remember that your success in
these areas depends on many factors, and that it may not be possible 
to correct the error immediately. It is more important to continue 
system operation as soon as possible. Later, you can address the 
crash using the tools described in this manual. 



1.3 CRASH ANALYSIS TOOLS 

To analyze a system crash, you need several sources of information, 
and you must use system programs to examine the information. You must 
use all your knowledge of the DECsystem-10 and the TOPS-10 monitor, as 
well as the GALAXY system, ANF-10 network communications, and all 
other software running on the system. The specific sources of 
information about a system crash are: 

o The CTY output for the time before the crash 

o The crash file 

o Listings or microfiche of the monitor sources, describing the 
algorithms, data structures, symbols, and bit definitions 

o The operator log book 

o The Monitor Tables descriptions from the TOPS-10 Software 
Notebook Set 

You will use the following tools in analyzing system crashes. 

o FILDDT (File DDT) allows you to examine files or the running 
monitor. Sections 2.3 through 2.4 describe FILDDT.

o EDDT (Exec DDT) allows you to examine, breakpoint, and patch 
the running monitor. Section 6.2 describes EDDT. 

o CRSCPY copies crash files and stores information about them 
in a database. The TOPS-10 Operator's Guide describes
CRSCPY.

o SPEAR creates reports, based on the system error log file
(ERROR.SYS), which are useful for tracing non-fatal errors
that may have led to the system crash. Refer to the
TOPS-10/20 SPEAR Reference Manual for more information about
SPEAR.

o OPR, the operator interface to the DECsystem-10, provides 
commands that allow you to change the system configuration 
and to control software processes. Refer to the TOPS-10 
Operator's Command Language Reference Manual for more
information about this program. 

You will also need to use a text editor such as TECO to patch the 
monitor sources or system startup files after you have solved a 
software problem. 



1.4 CRASH ANALYSIS PROCEDURE

To isolate a system problem, you must use FILDDT to examine the crash 
file. The crash file records the state of the system at the time of 
the crash, including information you can use to determine the cause of 
the crash, such as:

o Processor mode (user, user I/O, or exec mode) 

o Stack pointer and stack in use 

o Contents of accumulators 

o Stopcode information 

First you must obtain the crash file. In Chapter 2, you will learn 
how the monitor creates and maintains crash files. Chapter 2 also
contains procedures for loading the monitor symbols for FILDDT and 
using the symbolic FILDDT to examine a crash file and extract the 
information listed above. 

Chapter 3 explains how to interpret the information you obtain from 
the crash file, to determine the state of the system at the time of 
the crash. 

Chapter 3 contains a discussion of processor modes, job scheduling, 
and the priority levels that the monitor uses in timesharing, and how 
the information from the crash file can point to the faulty code that 
caused the crash. 

After you have determined the monitor process that failed, you can 
begin to investigate the crash file for the actual routine that 
failed. Chapter 4 contains a description of the monitor's data 
structures and how to obtain information about them from the crash 
file and the source code. 

The monitor may crash, or hang without crashing, because an error has 
occurred in the error handling and recovery procedures. Chapter 5
contains descriptions of the system error recovery routines.
Continuable stopcodes are described in more detail. You can use the
information in this chapter to determine whether error handling 
routines are functioning properly. 

It is sometimes necessary to analyze and correct a system error while
the monitor is running, either because a system reload does not
correct the error, or the error only becomes apparent while the system
is running. If you encounter a problem that defies analysis using
FILDDT to examine crash files, you can use EDDT to examine and correct
locations in the running monitor. For example, if the system halts or
hangs without dumping or without reloading, or if a problem exists
that does not interfere with timesharing, you can use EDDT to examine
the running monitor. This procedure is described in Chapter 6. 

Appendix A contains a Glossary of the acronyms used in this manual. 

Appendix B contains illustrations of the general layout of monitor 
code in virtual address space, for TOPS-10 Version 7.04. 



CHAPTER 2
EXAMINING A CRASH FILE



When the system crashes, the monitor attempts to record information 
about the state of the system at the time of the crash. Normally, the 
system writes a copy of memory to disk before beginning system reload 
operations. This copy of memory is called a crash file, or just "a 
crash". You can examine this file using a special version of DDT
called FILDDT. This chapter explains in more detail how the crash 
file is created and how to locate the crash file for a particular 
crash. The procedure for preparing FILDDT so that you can examine the 
crash file is also described, as well as some of the information that 
you can obtain immediately by examining the CTY output of stopcode 
information. 



2.1 CREATING A CRASH FILE 

When a stopcode occurs, BOOT automatically creates a crash file of the 
contents of memory, called CRASH.EXE, and copies it to the system 
crash list. If BOOT cannot dump memory automatically, you can force a 
dump by typing the following command on the CTY: 

BOOT> str:/D 

Use /D to force the crash file to be written. You may include the 
name of a file structure (str:).

If this action fails, the CRASH.EXE file on every file structure in 
the system crash list may be unprocessed by CRSCPY. 

The allocation of CRASH.EXE space is accomplished when you define file 
structure information in the ONCE dialog. You can modify the amount 
of space reserved for crash files by running the monitor in user mode. 
Refer to the TOPS-10 Software Installation Guide for complete 
information about ONCE. 

To stop the machine when a malfunction occurs, deposit a non-zero 
value into physical location 30. The monitor checks this location at 
every clock tick. If it finds a non-zero value, the monitor jumps 
into BOOT. You can initiate this procedure using one of the following 
commands.

The first example is a command to the PARSER on a KL system. Type
<CTRL/backslash> where you see ^\. In the following examples,
semicolons precede comments that should not be included in your input.

^\                  ;invoke the PARSER

PAR>SHUTDOWN        ;shut down the system

[Dumping on DSKA:CRASH.EXE[1,4]]

For a KS system, you type the following commands:

^\                  ;invoke the console

ENABLED

KS10>SHUTDOWN       ;shut down the system

USR MOD
[Dumping on DSKA:CRASH.EXE[1,4]]

If the monitor can reach clock level, this command will start BOOT. 
BOOT stops the machine, writes a crash file, and begins automatic 
reload procedures. If the monitor has been up less than five minutes, 
BOOT starts, but does not initiate the dump and reload action. 
Instead, BOOT prints the BOOT> prompt and waits for you to type a 
command. 

If the SHUTDOWN command is ineffective, you must instruct the monitor 
to begin system shutdown procedures. The following commands to the
PARSER accomplish that on a KL system: 

^\                  ;invoke the PARSER

PAR>SET CONSOLE MAINTENANCE 

PAR>HALT 

PAR>EXAMINE KL 

PAR>JUMP 407 

This instructs the monitor to execute the instruction at location 407, 
which signals the policy CPU to initiate a system shutdown procedure. 
In multiple-processor systems, it may be desirable to initiate system 
shutdown procedures on the current CPU instead of the policy CPU. To 
accomplish this, jump to location 406 instead, using the following 
command: 

PAR>JUMP 406 

For the KS, you might use the following procedure to force a system 
shutdown:

^\                  ;invoke the console

ENABLED

KS10>HALT           ;halts the system

KS10>MR             ;forces exec mode

KS10>SM             ;halts at default location

KS10>ST 407         ;loads BOOT

USR MOD

You should try to use the SHUTDOWN procedure first, because a forced 
reload does not save the PC, and there is danger of losing device and 
interrupt status information.

After a fatal stopcode or a manual dump operation, BOOT displays the 
following information on the CTY: 

[Dumping on DSKA:CRASH.EXE[1,4]]
[Loading from DSKA:SYSTEM.EXE[1,4]]

As the second message indicates, BOOT automatically reloads the 
monitor. The automatic reload function can be disabled using the OPR 
program. This function is useful when debugging the monitor, as 
described in Chapter 6.

The CRSCPY program runs when the system is reloaded, to copy the 
CRASH.EXE file to a unique file name that will not be superseded by 
subsequent CRSCPY runs. If your system did not run CRSCPY when it 
reloaded, you must copy the CRASH.EXE file to a safe area manually. 

As soon as you can log into the system, save the crash in the
XPN: area of the disk structure by typing the following command: 

.R CRSCPY 
CRSCPY>COPY 

The CRSCPY program copies the file using a unique file name and 
reports it when the operation is finished. For more information about 
CRSCPY, refer to the TOPS-10 Operator's Guide.

You can use SYSTAT to obtain an overview of the status of the system 
at the time of the crash. Use the /X switch to SYSTAT to indicate a 
crash file, and include the name of the crash file. For example, to 
examine the SYSTAT information for a crash file named SER003.EXE, type 
the following command:

.SYSTAT/X XPN:SER003.EXE

The /X switch specifies that the SYSTAT program should read the file
XPN:SER003.EXE (the file name assigned by CRSCPY) instead of the
running monitor. 



2.2 USING FILDDT

FILDDT is a system debugging tool designed for debugging files that 
are stored on disk. Because FILDDT is a modified version of DDT, you 
must be familiar with DDT before you attempt the procedures described 
in the following sections. For more information about DDT, refer to 
the TOPS-10 DDT Manual. 

FILDDT has all the commands of regular DDT, with one major difference: 
commands that control program execution do not work. Those commands 
are:

$G Start the program. 

$X Execute a single instruction. 

$P Proceed with execution. 

$B Set breakpoints.

The monitor, because of its large size, runs with local and global 
symbols removed. You cannot examine the crash file without these 
symbols, so you must load the symbol table of the monitor into memory 
with FILDDT and save the modified version of FILDDT. To create this 
special monitor-specific FILDDT, follow the procedure explained below. 

First, run the standard version of the FILDDT program: 

.R FILDDT 

File: 

You must type the name of the file from which the symbols are to be 
loaded. This file must be the runnable monitor; that is, the monitor 
before loading (often SYS:SYSTEM.EXE). Include the /S switch to
indicate that symbols are to be loaded.

File: SYS:SYSTEM.EXE/S

The /S switch tells FILDDT to load the symbols for this file. When
FILDDT displays another File: prompt, type <CTRL/Z> to exit from
FILDDT, then type the SAVE command to the monitor with the file name
you choose for the symbolic FILDDT, to save the runnable file. In
the following example, the symbolic FILDDT is called MONDDT.

File: ^Z

.SAVE MONDDT 
MONDDT saved 

After you save the symbolic FILDDT program, you can use the RUN 
command to start the new FILDDT at any time. For example, the 
following commands start the symbolic FILDDT and give it the name of a 
crash file (XPN:SER003.EXE) to examine:

.RUN MONDDT

File: XPN:SER003.EXE

When FILDDT reads the crash file, it reports the mapping of the ACs in
the following message:

[Looking at file DSKA:SER003.EXE[10,1]]
[Paging and ACs set up from exec data vector]

The monitor locations saved in the crash file must now be mapped to
the virtual monitor addresses. FILDDT provides special commands for
mapping the monitor and the user address space. Before you issue a 
mapping command, FILDDT assumes all locations are physical references. 



2.3 ESTABLISHING PROPER MAPPING

Virtual addressing machines require special consideration. 
Instructions in programs are loaded into memory by a mapping scheme 
based on page maps. The actual physical location of a word in the
monitor will not necessarily be the same as the virtual location. 
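
As a rough illustration of why a virtual address need not equal its
physical address, the following Python sketch translates an address
through a toy page map. The names page_map and translate and the
sample page numbers are assumptions made for this example only. Real
KL-paging map entries carry access bits and pointer types that are
not modeled here.

```python
from typing import Dict, Optional

PAGE_SIZE = 0o1000  # 512 words per page, written in octal as DDT would show it


def translate(virtual_address: int, page_map: Dict[int, int]) -> Optional[int]:
    """Translate a virtual word address through a simplified page map.

    page_map is a hypothetical stand-in for a process table: it maps a
    virtual page number to a physical page number.
    """
    virtual_page, offset = divmod(virtual_address, PAGE_SIZE)
    physical_page = page_map.get(virtual_page)
    if physical_page is None:
        return None  # no mapping exists for this page
    return physical_page * PAGE_SIZE + offset


# Hypothetical map: virtual page 340 lives in physical page 12 (octal).
example_map = {0o340: 0o12}
print(oct(translate(0o340123, example_map)))
```

In this sketch an unmapped page yields None, which loosely corresponds
to FILDDT printing a question mark when you reference an unmapped
location.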

The symbolic FILDDT contains the virtual address of each location, but 
not its physical address. You must map FILDDT memory references
through the Exec Process Table (EPT) to examine monitor locations, or
through the User Process Table (UPT) to examine user locations. To
establish mapping, you must perform the following steps:

1. Find the page numbers of the page maps.

2. Issue the FILDDT mapping instruction (a $nU command).

3. Verify that the mapping is correct.

The following sections describe two methods for mapping the dump and 
obtaining preliminary information concerning the state of the 
processor at the time of the crash. The instructions used in the 
following procedure may be included in a FILDDT command file (also 
called a patch file).

To map a crash, you must provide FILDDT with pointers to mapping 
tables and other locations in the monitor. The mapping tables and 
monitor locations are described in more detail in Chapters 3 and 4.



2.3.1 FILDDT Mapping Instructions

FILDDT allows you to specify the type of address mapping to use in 
locating information. You can specify virtual or physical addressing. 
The mapping instructions are:

$U enables virtual addressing. This instruction also sets the
FAKEAC flag, indicating that physical locations 0-17 are to
be interpreted as the user accumulators (ACs).

$$U enables physical addressing. The FAKEAC flag is cleared, 
indicating locations 0-17 are interpreted as hardware 
registers 0-17. 

By default, physical addressing is enabled. FILDDT interprets all 
addresses as physical until you issue a virtual mapping instruction. 
The mapping is correct only for the data in portions of the monitor's 
low segment, because the low segment virtual addresses equal the 
physical addresses. 

The TOPS-10 monitor uses KL-paging, also called "extended addressing" 
(described in Section 3.3). By default, FILDDT is enabled for
KL-paging. If it is necessary to disable KL-paging (for an older
version of the monitor, for example), you can issue the following
command to FILDDT: 

0$11U 

To enable KL-paging, type the following command: 

1$11U 



The command n$11U establishes the mapping scheme so that FILDDT will
read the page maps correctly.



Next, you must point FILDDT at the correct page maps that associate 
virtual addresses (loaded into the symbolic FILDDT) with the physical 
addresses (saved in the crash file), and establish virtual mapping.



2.3.2 Mapping the Crash 

To map virtual addresses to physical ones, FILDDT needs the locations 
of the Exec Process Table (EPT) and the Special Pages Table (SPT).
The EPT allows FILDDT to map exec virtual memory. The SPT is used to 
map the user job that was running at the time of the crash. 

On a multiple-processor KL system, the dump contains an EPT for each 
CPU in the system. To analyze the dump, you must map FILDDT through 
the EPT for the CPU that crashed. A CPU Data Block (CDB) exists for 
each CPU in the system. On a single-processor system, there is one 
CDB. The CDB contains the address of the EPT. Therefore, you must 
first find the CDB for the CPU that crashed. The location DIECDB 
contains the pointer to the CDB of the CPU that crashed. 

NOTE 

The contents of DIECDB are written when the system 
crashes, but not when the system hangs. When you are 
analyzing a hung system, the contents of DIECDB (if 
nonzero) were written by a previous crash, and 
therefore may be invalid. 

You can see the contents of DIECDB by typing the following command to
FILDDT: 

DIECDB[ 12000

In this example, the physical starting address of the CDB is 12000. 
The location of the EPT is stored in the CDB at the offset symbolized 
by .CPEPT. Use the following command to open .CPEPT and read its 
contents:

$Q+.CPEPT-.CPCDB[ 1000

The first part of the instruction ($Q) refers to the last value 
displayed (that is, the contents of the currently open location).
This value is 12000. Starting from location 12000, the pointer moves 
to the offset indicated by the difference between the values of .CPEPT 
and .CPCDB. The new location is the offset into the CDB of the EPT 
address (.CPEPT). The instruction opens the location .CPEPT and 
displays its contents. The EPT address is displayed as physical
location 1000. 

FILDDT needs the page number for the EPT, not its physical address. 
Therefore, you must divide the contents of .CPEPT by 1000 (octal),
the number of words in a page.

Submit the result of this division operation to FILDDT using the $0U 
command. For example, to calculate the page number and map the EPT, 
type the following FILDDT instruction: 

$Q'1000$0U

This command divides the previous value (using the $Q command) by 1000 
and submits the result to FILDDT as the EPT page number. In this 
example, the page number is 1. 
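
The division above can be checked by hand or with a few lines of
code. The following Python sketch, offered only as an illustration,
assumes the 1000 (octal) words-per-page figure used in this section.
The function name page_number is hypothetical.

```python
PAGE_SHIFT = 9  # 2**9 = 512 = 1000 octal words per page


def page_number(physical_address: int) -> int:
    """Return the page number that contains a physical word address."""
    return physical_address >> PAGE_SHIFT  # same as dividing by 0o1000


# The EPT address from the example in this section.
ept_address = 0o1000
print(oct(page_number(ept_address)))  # prints 0o1
```

Shifting right by nine bits discards the offset within the page, so an
address that is not page-aligned still yields the page that contains
it.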

Exec virtual memory is mapped after the $0U command. This is 
sufficient for examining monitor memory locations in the crash. 
However, to examine user data, you must map the current user job. The 
FILDDT command n$6U maps the user job and its associated per-process 
storage in exec virtual memory (funny space). The value of n is the
page number of the UPT (User Process Table).

The SPT contains a word for the current job running on each CPU in the 
system, plus a word for each user job. The right half of each SPT 
slot contains the page number of the UPT for the current CPU. When 
extended addressing is enabled, the SPT points to the UPT. 

The following FILDDT command sets the SPT base address: 

JBTUPM+(job#)-(CPU#)$6U

To map a user job other than the current job on the current CPU, add 
the contents of the right half of JBTUPM to the job number, then 
submit the result to the $U command. 

FILDDT provides temporary registers to contain either hardware 
registers or user accumulators. When hardware mapping is established, 
FILDDT assumes that locations 0-17 refer to hardware registers 0-17. 
However, when you issue a virtual mapping command ($U), the user ACs
can be mapped through the temporary registers. This allows you to
load the user ACs into the temporary registers and then refer to the 
user ACs as locations 0-17. 

You can use the following FILDDT instruction to map the current AC
block to the temporary registers provided by FILDDT. The instruction 
to open and map the current AC block is: 

.CPACA[ $Q$5U 

This instruction is useful only if the location .CPACA contains the 
address of the current AC block. If, however, a UUO at interrupt 
level occurs (UIL stopcode), this instruction cannot be used
successfully. Instead, you must determine the location of the current 
AC block by defining the interrupt level in progress at the time of 
the crash. The AC blocks and interrupt levels are described in more 
detail in Chapter 3.

The user job in memory may not match the UPT currently in use at the 
time of the crash. You can check the user job that was running by 
comparing the contents of offset .CPJOB in the CDB with the contents 
of .USJOB in the UPT. If these values do not match, the interrupt
routine was switching UPTs at the time of the crash; use the UPT for
the job number that is in .USJOB.

Look at the code that you are familiar with, in the high segment, to 
make sure the dump is mapped correctly. Also check location 410 
(ABSTAB), which should point to NUMTAB, which is one of the first
locations in the low segment. 

If you set up mapping through the wrong page map, FILDDT returns a 
question mark whenever you try to reference an unmapped location. For 
example, this could occur if you use the null job's UPT to set 
mapping. To reset mapping, use the "$$U" command to set physical 
mapping by FILDDT. 



2.4 VERIFYING THE DUMP 

Occasionally, your monitor will crash in the process of upgrading to a 
new version, or when you are making modifications to the code. In 
these cases, it is possible that your crash file will be based on a 
different version of the monitor than the monitor-specific FILDDT you 
created. You should make sure that the symbols in the 
monitor-specific FILDDT match the crash that you are examining. If 
values of the symbols do not match, the information in the crash file 
may be useless, misleading, or corrupted. 

There are several ways to check the symbols. One is to make sure the
version number of the crashed monitor matches that of your current 
monitor. Another is to examine addresses in the monitor with known 
contents and verify that they contain the right information. 

Monitor location CNFDVN contains the monitor version number and edit 
number. This version number should match the version number displayed 
by the DIRECTORY monitor command. 

.DIRECTORY IEZ093.EXE

IEZ093 EXE 8196 <155> dd-mmm-yy 704(33432) DSKB:[10,1]

.RUN MONDDT

File: DSKB:IEZ093.EXE[10,1]
[Looking at file DSKA:SER003.EXE[10,1]]
[Paging and ACs set up from exec data vector]

$$C

CNFDVN/ 70400,,33432

Note that the DIRECTORY command reports version and edit numbers
704(33432), matching the contents of CNFDVN: 704 in the left half, 
and 33432 in the right half. 
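
If you copy the contents of CNFDVN into another tool, it can help to
split the 36-bit word into its two 18-bit halves the way DDT displays
them. The following Python sketch shows one way to do that. The
function names split_halves and format_halves are assumptions made
for this example and are not part of FILDDT.

```python
from typing import Tuple

HALF_MASK = 0o777777  # 18 bits


def split_halves(word: int) -> Tuple[int, int]:
    """Split a 36-bit word into (left half, right half)."""
    if not 0 <= word < (1 << 36):
        raise ValueError("value does not fit in a 36-bit word")
    return (word >> 18) & HALF_MASK, word & HALF_MASK


def format_halves(word: int) -> str:
    """Format a word in the octal 'left,,right' style shown by DDT."""
    left, right = split_halves(word)
    return f"{left:o},,{right:o}"


# Rebuild the word from the example display and print it back.
cnfdvn = (0o70400 << 18) | 0o33432
print(format_halves(cnfdvn))  # prints 70400,,33432
```

Comparing the right half with the edit number reported by the
DIRECTORY command is then a simple equality test.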

You can obtain the name of the monitor by reading ASCII text starting 
at location CONFIG, as shown in the following example: 

CONFIG$0T/ RL371A DEC10 Development

In this case, the full system name is "RL371A DEC10 Development".

If these values match, you can be relatively sure that the 
monitor-specific FILDDT and crash file match. 



2.5 FILDDT COMMAND FILES 

FILDDT command files are used to map a dump and obtain preliminary 
information that might be relevant to analyzing the crash. A command 
file is a set of FILDDT commands that are executed automatically when 
you issue the $Y command to FILDDT. Command files are also used to
edit the runnable monitor (as opposed to making edits to source
modules and rebuilding the monitor).

The FILDDT command $Y invokes a series of FILDDT commands stored in a 
file on disk. This allows you to easily execute a set of commands 
that you use frequently instead of typing them in. You could use a 
command file to map and verify a dump and to extract information you 
are likely to need while diagnosing a crash, as described below. 

NOTE 

The $ (dollar sign) is displayed when you press the 
ESCape key in FILDDT. It is used here to show where 
you must insert an ESCape character into the file. 
Most text editors require a special procedure for 
inserting ESCape and other non-printing characters 
into a file. You must use the text editor 
documentation to find the method for quoting 
characters if you do not know how to insert an ESCape 
character into a file. 

The following command file maps a crash file for a multiple-processor 
KL system. The same command file is equally useful on a single-CPU KL 
or a KS system. The command file also verifies the correspondence of 
the dump with the monitor-specific FILDDT and displays pertinent 
system information about the crash. 

Comments are included here to describe the functions of the commands. 
However, FILDDT will not accept a command file with comments. Your 
actual command file should NOT contain the comments in the following 
example : 

     .TYPE VERIFY.DDT        ;display contents of patch file

     DIECDB[                 ;gets addr of CDB for CPU that crashed
     $Q+.CPEPT-.CPCDB[       ;gets addr of the EPT
     $Q'1000$U               ;divides addr by 1000 to get page number
     SPTTAB$6U               ;sets the SPT base address
     .CPACA[$Q$5U            ;maps AC references
     .CPCPI[                 ;gets PI status
     .CPPGD[                 ;gets DATAI PAG results
     .CPSPT[[                ;gets the address of the SPT
     .CPDWD[                 ;gets CPU's DIE interlock word
     .CPCPN[                 ;CPU number of crashed CPU
     .CPJOB[                 ;gets job number of current job
     .USJOB[                 ;job number in funny space
     .CPTCX[                 ;process context word on page fails


You can include these and other FILDDT commands in a command file to 
obtain initial information about the crash. The locations referenced 
in this file are described in Chapter 3. 

The following example shows the types of information that might be 
displayed and how to interpret the information. Again, the comments 
are included for descriptive reasons, but comments are not allowed in 
an actual command file. 



     .R MONDDT                      ;run the symbolic FILDDT

     File: SYS:CRASH
     [Looking at file DSKA:SER003.EXE[10,1]]
     [Paging and ACs set up from exec data vector]

     $Y                             ;execute a command file
     File: MON.DDT                  ;command file is MON.DDT

     DIECDB[   13000                ;the address of the CDB for the
                                    ;CPU that crashed is 13000
     $Q+.CPEPT-.CPCDB[   3000       ;compute the offset into the CDB
                                    ;address of the EPT is stored
     $Q'1000$U                      ;compute the page number of the EPT
                                    ;and point FILDDT to the EPT
     SPTTAB$6U                      ;set the SPT base address
     .CPACA[   402077$Q$5U          ;map AC references
     .CPCPI[   377                  ;377 indicates PI levels are enabled
     .CPPGD[   700100,,2600         ;DATAI PAG shows that:
                                    ;current AC block is 0 (exec)
                                    ;previous AC block is 1 (user)
                                    ;previous context section is 0 (exec)
                                    ;UPT page number is 2600
     .CPSPT[SPTTAB+1[   2600        ;shows UPT page number of currently
                                    ;mapped job on this CPU
     .CPDWD[   0                    ;Die interlock word
     .CPCPN[   1                    ;CPU1 failed
     .CPJOB[   5                    ;Job 5 was running
     .USJOB[   5                    ;Job 5 is mapped on this CPU
     .CPTCX[   701100,,2364         ;Process context information:
                                    ;current AC block is 1 (user)
                                    ;previous AC block is 1 (user)
                                    ;previous context section is 0
                                    ;user base page number is 2364

It is important to compare the value of .CPTCX with the contents of
.CPPGD. The process context word stored in .CPTCX and the DATAI PAG
word stored in .CPPGD are different when the state of the processor at
the time of the crash is indeterminate (for example, for IME or EUE
stopcodes).

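
The two words shown in the example, 700100,,2600 (.CPPGD) and 701100,,2364 (.CPTCX), can be decoded mechanically. The following Python sketch assumes the KL DATAI PAG layout (current AC block in Bits 6-8, previous AC block in Bits 9-11, previous context section in Bits 13-17, and the UPT page number in Bits 23-35); the function and field names are hypothetical.

```python
def decode_context_word(word):
    """Decode a DATAI PAG or process context word (assumed KL layout)."""
    left = (word >> 18) & 0o777777
    right = word & 0o777777
    return {
        "current_ac_block": (left >> 9) & 0o7,    # Bits 6-8
        "previous_ac_block": (left >> 6) & 0o7,   # Bits 9-11
        "previous_section": left & 0o37,          # Bits 13-17
        "upt_page": right & 0o17777,              # Bits 23-35
    }


def halves(left, right):
    """Build a 36-bit word from the 'left,,right' notation used by DDT."""
    return (left << 18) | right


cppgd = decode_context_word(halves(0o700100, 0o2600))
cptcx = decode_context_word(halves(0o701100, 0o2364))

# A mismatch suggests the processor state was indeterminate at the crash.
for name in cppgd:
    marker = "" if cppgd[name] == cptcx[name] else "   <-- differs"
    print("%-18s %6o %6o%s" % (name, cppgd[name], cptcx[name], marker))
```

Printing the two decoded words side by side makes it easy to see which fields disagree, which is the comparison the text recommends for IME and EUE stopcodes.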
2.6 STOPCODE INFORMATION

The following information is useful when the system crashed with a 
stopcode. You can determine the stopcode information by looking at 
the CTY for the CPU that crashed. The stopcode name is printed on the 
CTY, and is stored in location .CnSNM, where n is the CPU number. Use 
the Stopcodes Specification in the TOPS-10 Software Notebook Set to 
look up the module that generated the stopcode.

The stopcode routines in the monitor also store and print the 
following types of information on the CTY: 

o Date and time of crash 

This information is stored in a series of locations starting 
at LOCYER: 

LOCYER - Year of the crash 
LOCMON - Month of the crash 
LOCDAY - Day of the crash 
LOCHOR - Hour of the crash 
LOCMIN - Minute of the crash 
LOCSEC - Second of the crash 

Remember to display these locations in decimal, not octal. 

o Current job 

The word at address .CnJOB holds the job number of the 
current job on CPUn. 

o PPN of current job 

The PPN is stored in the JBTPPN table, indexed by the job
number.

o Program name of current job 

The program name is stored in SIXBIT in the JBTNAM table, 
indexed by the job number. 

o Terminal of current job 

The terminal name is stored in SIXBIT in the first word of
the Terminal DDB, pointed to by TTYTAB (indexed by job
number).

o CPU number 

The CPU number of the CPU that crashed is determined from the
value of .CnDWD (where n is the CPU number). Test this
symbol for a negative value (-1) for each CPU in a
multiple-CPU system. A negative value indicates that the CPU
did not crash. If the contents of .CnDWD are equal to zero,
the current CPU is the CPU that crashed.
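
The test on the DIE interlock word can be written out as a short Python sketch. It assumes that you have already read the 36-bit interlock word for each CPU from the dump; the dictionary and function names are hypothetical, and the sketch only applies the two rules stated above (negative means the CPU did not crash, zero means it did).

```python
WORD_BITS = 36


def to_signed(word):
    """Interpret a 36-bit word as a two's-complement integer."""
    word &= (1 << WORD_BITS) - 1
    if word & (1 << (WORD_BITS - 1)):
        return word - (1 << WORD_BITS)
    return word


def crashed_cpus(dwd_by_cpu):
    """Return CPU numbers whose interlock word is zero."""
    return [cpu for cpu, word in sorted(dwd_by_cpu.items())
            if to_signed(word) == 0]


# Hypothetical values: CPU0 holds -1 (did not crash), CPU1 holds 0.
print(crashed_cpus({0: 0o777777777777, 1: 0}))   # prints [1]
```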

Refer to Section 5.2 for more information about the types of stopcodes 
and the information they provide. 
CHAPTER 3
LOCATING THE FAILURE 



The monitor is the portion of the software that is responsible for 
interfacing user programs to hardware. Specifically, the monitor is 
responsible for the following functions: 

1. Performing tasks for a user before and after running a
program, such as copying or deleting files, finding the
status of the system, and running or stopping programs.
TOPS-10 provides the user interface in the form of the 
command language. 

2. Executing the program. The user must make requests for all
services (including I/O). The user programming interface is
standardized in the form of monitor calls, also called
Unimplemented User Operators (UUOs).

3. Providing access to the data base. This is done by creating 
a logical file system for data stored on disk devices. 

4. Controlling CPU usage. A timesharing system must know how to 
determine who should get control of the computer. This is 
called scheduling. 

5. Controlling memory usage. For the system to run efficiently, 
jobs must be moved in and out of memory at the right time. 
This operation is known as swapping and paging. 

6. Controlling access to sharable devices. The main sharable 
devices on timesharing systems are disks. Because many jobs 
will be using files on the same disk drive, adequate control 
must be maintained to prevent destructive interference. 

7. Controlling access to single-user (non-sharable) devices.
The monitor must implement a way to allocate these devices to
the right users and control the I/O. TOPS-10 does this with
the GALAXY batch and spooling system.

8. Providing error analysis when hardware or software errors
occur (DAEMON and SPEAR).

9. Providing accounting information so the system can be fairly
allocated and users charged for what they use (ACTDAE).
3.1 HARDWARE MAPPING

The hardware uses three types of tables to establish and maintain 
mapping of locations in memory for a job: process tables, section 
tables, and page tables: 

o The process table describes characteristics for a specific 
job and includes a pointer to each section map required to 
map the job. There are two process tables: the Exec Process
Table (EPT) and User Process Table (UPT).

o The section map contains pointers to the page map for each 
virtual section for the monitor or user job. 

o The page maps contain locations for each physical and virtual 
page allocated to the monitor or user job. 

The paging system uses two process tables: the UPT to map the user
job and the EPT to map the monitor. The UPT (User Process Table) is 
the table used to describe user address space. Each user job has its 
own UPT, which must be loaded before the job can be run. The EPT 
(Exec Process Table) is used to describe the monitor address space. 

The processor runs by switching between user mode and exec mode. To 
perform address translation quickly, the hardware must know the 
locations of the process tables. Two registers are used to find the
process tables: the User Base Register (UBR) points to the UPT and 
will vary for each job that is loaded into memory. The Exec Base 
Register (EBR) points to the EPT. On multiple-CPU systems, each CPU 
has an EBR and a UBR at all times. 



3.2 PAGING POINTERS 

The page maps contain pointers to physical pages of data. The page 
maps are read by the microcode, which evaluates two kinds of pointers: 
section pointers that point to section maps, and page map pointers 
that point to physical pages. Section and page pointers have 
identical formats. There are four types of pointers, indicated by a 
code stored in Bits 0-2 of the word. The access code is applied to 
the address by ANDing Bits 3-6 of all pointers used to evaluate the 
address . 

The pointer to non-accessible pages has code 0 in Bits 0 through 2.

The pointers to accessible pages also include accessibility codes in
Bits 3 through 6. Bit 3 (P), if set, indicates that the page is
public. Bit 4 (W) indicates whether the page is writable, and Bit 6
(C) indicates whether the page can be cached.

Bit 5 of the pointer to an accessible page is used by the MCA25
hardware option as the "Keep Me" bit. That is, if Bit 5 is set in the
page pointer, the address translation for that page is not cleared in
the hardware pager, providing that the DATAO PAG (context switch) is
issued with Bit 3 set.
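
The bit assignments above can be illustrated with a short Python sketch. It follows only the layout given in this section (type code in Bits 0-2, P/W/Keep/C in Bits 3-6, access formed by ANDing Bits 3-6 of every pointer used); the function names and the sample pointer values are hypothetical.

```python
def bit(word, n):
    """Return Bit n of a 36-bit word, where Bit 0 is the leftmost bit."""
    return (word >> (35 - n)) & 1


def decode_pointer(word):
    """Decode the type code and access bits of a section or page pointer."""
    code = (word >> 33) & 0o7              # Bits 0-2
    return {
        "type_code": code,
        "accessible": code != 0,
        "public": bool(bit(word, 3)),      # P
        "writable": bool(bit(word, 4)),    # W
        "keep": bool(bit(word, 5)),        # MCA25 "Keep Me"
        "cacheable": bool(bit(word, 6)),   # C
    }


def effective_access(pointers):
    """AND Bits 3-6 of all pointers used to evaluate an address."""
    access = 0o17
    for word in pointers:
        access &= (word >> 29) & 0o17
    return access


# Hypothetical pointers: the section pointer allows P, W, and C;
# the page pointer allows only P and C, so the page is not writable.
section_ptr = (1 << 33) | (1 << 32) | (1 << 31) | (1 << 29)
page_ptr = (1 << 33) | (1 << 32) | (1 << 29)
print(oct(effective_access([section_ptr, page_ptr])))   # prints 0o11
```

Because the access bits are ANDed, a page is writable only if every pointer on the path to it has the W bit set.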
3.3 EXTENDED ADDRESSING

The KL processor uses KL-paging to allow code and data to be grouped 
into virtual sections; each section is a maximum of 512 pages of 
virtual memory. The monitor layout for a KL with extended addressing 
enabled is illustrated in Appendix B. 

The KS processor does not support extended addressing. However, 
because KL-paging is required in order to run TOPS-10 Version 7.04, 
the KS processor simulates KL-paging by choosing an alternate page map 
when necessary. 

The primary page map for the KS monitor is the Section 0 page map. To
perform a monitor call to an extended section, the KS monitor changes
the page map pointer. For example, to execute the DNET. monitor
call, a special macro reads the Section 2 page map pointer (from
SECTAB+2 in the EPT) and writes the address into the Section 0 page
map pointer (at SECTAB in the EPT). The KS accesses locations in the
Section 2 page map until the monitor call has been serviced. A
similar macro restores the Section 0 page map pointer to SECTAB.



3.4 MONITOR-RESIDENT USER DATA 

Some information that pertains to the specific user is kept in the 
monitor's address space, in the exec page maps. Each word in a page 
map can point to a physical page in memory, but the Section 0 Page Map
also contains indirect pointers to the UPT. The monitor uses these 
virtual addresses to reference job-specific locations, such as funny 
space. 

The job-specific data in monitor address space is composed of the 
following areas, which are described separately below. 

Funny Space (Per-Process Area) 

UPT 

.UPMAP (Section 0 page map)

.UPMP/.UUPMP (UPT origin) 

JOBDAT 

Vestigial JOBDAT 

The information in these pages is specific to the current user, so the 
job's page maps in the crash file contain virtual and physical 
addresses. In a multiple-CPU system, the SPT (Special Pages Table) 
for that CPU contains the current user page map page. When a new job 
is selected to run, only the UBR and the SPT words need to be changed. 

Certain pages of the executive virtual address space are designated as 
the per-process monitor free core, also known as funny space, for the 
job that is currently running on that CPU. This is monitor memory 
that is swapped with the job, and contains information pertaining to 
its disk DDBs, monitor buffers, SWITCH.INI, the extended channel 
table, and so forth. 

The monitor references the user's funny space with the symbol .UPMP,
which points to the first location in the UPT, and reads the physical
location in memory from the page table for user page 0.

User page 0 contains JOBDAT locations, which are used by the monitor
for handling the user job.

Vestigial JOBDAT is the job data area for the job's high segment. 
3.5 PROGRAM COUNTER WORD

The PC (Program Counter) double-word contains the location of the next 
instruction that the system will execute, including flags to indicate 
whether the processor is in user mode or exec mode. The PC is stored 
in the job's UPT (at USRPC) and in the CDB (at .CnPC). When you
analyze a crash, you must examine Bit 5 of the PC word to determine 
whether the processor was in user mode or exec mode at the time of the 
crash. If Bit 5 of the PC is set, the processor was in user mode. If 
Bit 5 is clear, the crash occurred in exec mode. The remaining PC 
flags indicate arithmetic overflow conditions and so forth. 

The PC contains a thirty-bit address, which points to the next 
instruction to be executed. When control passes to a section other 
than the section where the instruction was issued, that instruction 
must refer to a 30-bit address. To store the 30-bit PC with flags, 
the flag-PC doubleword is used. The flag word contains the PC bits in 
Bits 0-12, in a format identical to the single-word PC. Bits 13-17 
are unused. The right half of the first word is used by the hardware. 
The second word contains the page number and address. Bits 0 through
5 of the second word are zero. The format of the PC doubleword allows 
the flags (including the mode bit) to be read in the same manner as a 
single-word PC. You can also read the address in a double-word PC in 
the same way as a single-word PC, after you add 1 to the location of 
the PC word. 
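
The layout of the flag-PC doubleword can be sketched in Python as follows. The sketch uses only the layout described above (flags in Bits 0-12 of the first word, a 30-bit address in Bits 6-35 of the second word) and splits the address into a 12-bit section number and an 18-bit in-section address; the function name and the sample values are hypothetical.

```python
def decode_pc_doubleword(flag_word, address_word):
    """Split a flag-PC doubleword into flags, section, and address."""
    flags = (flag_word >> 23) & 0o17777        # Bits 0-12 of first word
    address = address_word & 0o7777777777      # Bits 6-35 of second word
    return {
        "flags": flags,
        "section": address >> 18,              # upper 12 bits of the address
        "in_section": address & 0o777777,      # lower 18 bits of the address
    }


# Hypothetical doubleword: only Bit 5 set in the flags, PC = 2,,1234
pc = decode_pc_doubleword(1 << 30, (2 << 18) | 0o1234)
print("flags=%o section=%o address=%o"
      % (pc["flags"], pc["section"], pc["in_section"]))
```

The flags are returned as a 13-bit field, so Bit 5 of the word appears as the value 200 (octal) in the result.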

Most instructions that use 30-bit addresses cannot be issued in
Section 0. Global section references are illegal in Section 0, except
for the OWGBP instruction, the XJRST and XJRSTF instructions, and the
XBLT function of the EXTEND instruction. Any other instructions with
global section references must be made from a non-zero section.



3.6 PROCESSOR MODES 

The processor reads the PC to determine whether the instruction is to 
be executed in user or exec mode. User mode allows user jobs to run 
programs and request the monitor for system resources. Exec mode
allows the monitor to satisfy user requests for system resources and 
perform overhead functions. 

You can determine the processor mode at the time of the crash by 
reading the PC word from the CDB. Bits 5 and 7 of the PC word are
useful in determining the processor mode. If Bit 5 is clear, the 
processor was in exec mode. If Bit 5 is set, the processor was in 
user mode. In user mode, if Bit 7 is set, the job is in public mode; 
if Bit 7 is clear, the job is in concealed mode. In exec mode, if Bit 
7 is clear, the process is in kernel mode. If Bit 7 were set in exec 
mode, this would establish supervisor mode, but this mode is not used 
by TOPS-10. 
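
The rules for Bits 5 and 7 can be summarized in a short Python sketch. It applies only the tests described above to the flag word of the PC; the function name and returned strings are hypothetical.

```python
USER_BIT = 1 << (35 - 5)     # Bit 5: set in user mode
PUBLIC_BIT = 1 << (35 - 7)   # Bit 7: public (user) or supervisor (exec)


def processor_mode(pc_flag_word):
    """Classify the processor mode from Bits 5 and 7 of the PC word."""
    user = bool(pc_flag_word & USER_BIT)
    bit7 = bool(pc_flag_word & PUBLIC_BIT)
    if user:
        return "user public" if bit7 else "user concealed"
    # Supervisor mode is not used by TOPS-10, so seeing it in a dump
    # is itself worth investigating.
    return "exec supervisor" if bit7 else "exec kernel"


print(processor_mode(USER_BIT | PUBLIC_BIT))   # prints user public
print(processor_mode(0))                       # prints exec kernel
```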

Processor modes, PCs, and paging pointers are described in the 
DECsystem-10/20 Processor Reference Manual.
3.6.1 User Mode

Normally a user program runs in user mode. When the program requests 
a monitor service, using a monitor call, the current processor flags 
and PC are saved. The program is stopped temporarily while waiting 
for the monitor service to be completed; this is called "blocking." 
Control of the processor is then passed to the monitor in exec 
(kernel) mode by clearing the processor flags and starting at a new 
PC. 

When an I/O operation is requested or completed, a device interrupt 
causes the monitor to service the device. On a regular basis, the 
monitor receives a clock interrupt, which initiates job scheduling and 
system maintenance (overhead functions). When the clock service
routine is finished, control passes to the appropriate user program, 
and the processor switches back to user mode by setting the flag bits 
(Bits 5 and 7) and restoring the user's PC. 

A user program runs in either User Public or User Concealed mode. 
User mode begins with a monitor command and ends when the program 
exits or encounters an error. Normally the program runs in public 
mode: Bits 5 and 7 of the PC word are set. The user program runs in 
concealed mode if Bit 7 is clear and Bit 5 is set. 



3.6.2 Exec Mode 

When a user program requests a service by the monitor, using a monitor 
call or a command, the processor must switch from user mode to exec 
mode. Exec mode allows the monitor to perform privileged services and 
provides the user's interface to file management, device control, and 
hardware communication in general. 

User programs run in user mode, and cannot perform direct I/O 
instructions. A range of I/O instructions, with device codes from 740 
to 774, are reserved for customer definition, and are therefore 
designated as unrestricted codes.

When a UUO is executed, a hardware trap condition occurs, causing the
microcode to store the following information in the UPT:

o PC doubleword 

o 30-bit effective address 

o Opcode and AC (from the instruction) 

o Process context word 

The new PC word is taken from one of the MUUO dispatch locations in 
the UPT, depending on the processor mode and whether or not the UUO 
occurred during the processing of another trap condition (a PDL
overflow, for example). Control passes to the MUUO routine in the
monitor, where UUO processing begins. The monitor uses AC Block 0.
The user program uses AC Block 1. To switch to AC Block 0 from Block
1, the monitor issues the following instruction:

DATAO PAG,addr

Where: addr contains the value [400100,,0].
When the job is not running, the user accumulators are stored in
JOBDAT in the user's address space. The monitor's accumulators are 
stored in the next higher locations in the user's address space. 

Once in the MUUO routine, the monitor checks the UUO for legality by 
checking the instruction stored in .USMUO of the UPT. The return PC 
from USRPC in the UPT is placed on the monitor's stack for this job. 
Then control passes to the appropriate routine to perform the function 
for the user. 

The execution of the user function may finish or it may block, waiting 
for something to happen (I/O, for example), before it can continue. 
If control can be returned to the user job, the user AC set is 
restored and control passes to the location pointed to by the PC in 
USRPC. If the job blocks, the monitor goes to clock level. After the 
blocking condition is serviced, the job can run again. At the time of 
the block, the monitor's PC is stored at USRPC in the UPT.

The MUUO routine uses a stack, also located in the UPT, which the
monitor can address because it is mapped through a monitor virtual
address (refer to Section 3.3).

Some values in the UPT can be cached without interfering with the 
system, such as the stack. These locations are referenced by the 
symbol .UUPMP. Other locations are not cached; they are referenced by 
the symbol .UPMP, which also points to the first location of the UPT. 
On a single-CPU system, the monitor caches the contents of all 
locations in the UPT from .UUPMP to .UPMP. On multiple-CPU systems, 
however, the system only caches the contents of .UUPMP. 



3.7 THE PRIORITY INTERRUPT SYSTEM 

In exec mode, the monitor can service the user program, a device 
request, or a clock-level interrupt. Interrupts can be caused by 
devices or by the clock. While in exec mode, the monitor services 
interrupts according to the Priority Interrupt (PI) level assigned to 
the interrupting process. A typical set of priority interrupt levels 
(also called PI channels) might be: 



     Level 0     DTE (Byte Xfer, Deposit, Examine only)
                 CI/NI (limited set of functions only)
     Level 1     none
     Level 2     DTA (DECtape)
     Level 3     Card reader, APR, clock
     Level 4     Line printer, magtape, NI, DTE (doorbell)
     Level 5     Disk, CI
     Level 6     ANF-10 network
     Level 7     Monitor



To distinguish the interrupt level of the system at any one time, four 
pieces of information are used: 

o The set of accumulators currently in use, which reveals the 
stack in use. 

o The processor mode (exec or user) . 

o The status of the PI system. 
o The process context word. When the monitor is called to
perform a service for a user job, as with a command or UUO,
the microcode creates the job's process context word and
writes it into the UPT. This process context word is
displayed by a DATAI PAG instruction where Bit 2 is cleared,
and contains the current AC block number, the previous AC 
block number, section bits, and the current UBR (User Base 
Register) . 

A summary of the interrupt levels and how to distinguish them is shown 
in the following table: 



Table 3-1: Interrupt Level Indicators

                          AC
                          Block   PDL        Mode   PI Status

     User Job             1       Variable   User   No PIs active
     Null Job             1       N/A        User   No PIs active
     UUO Level            0       JOBPDO     Exec   No PIs active
     Clock Level          0       NUnPDL     Exec   PI 7 active
     Device Interrupts
       Terminal driver    2       CnxPD1     Exec   PI SCNCHN active
       Disk service       3       CnxPD1     Exec   PI DSKCHN active
       Network service    4       CnxPD1     Exec   PI NETCHN active
       Other (level y)    0       CnyPD1     Exec   PI y active
     Page Fail            0       NUnPDL     Exec   Variable
                                  ERnPDL


You can find the stack by finding the current set of ACs. The process
context word, stored in the UPT, contains the current AC block. 

You can determine the status of the priority interrupt system by 
looking at the PI status word, stored at location .CnCPI in the CDB. 
This word is read by the monitor with a CONI PI instruction and stored 
in the CDB when the monitor starts to process a stopcode. Using this 
information you can determine whether the PI system was enabled, what 
PI levels were enabled, and what kinds of interrupts were in progress. 

The PI status word on a KL system has the following format: 



     Bits      Meaning

     0-10      Not used.
     11-17     Level on which a program requests an interrupt
               (Bit 11 = Level 1, Bit 12 = Level 2, and so
               forth).
     18-20     Write even parity (KL diagnostics only).
     21-27     Levels on which an interrupt is in progress.
     28        PI system is on.
     29-35     Enabled levels.
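
The PI status word can be decoded with a short Python sketch based on the table above. The sketch assumes that Bits 21-27 and Bits 29-35 are ordered the same way as Bits 11-17 (leftmost bit is Level 1); the function names are hypothetical.

```python
def levels(field):
    """Convert a 7-bit field (leftmost bit = Level 1) to level numbers."""
    return [level for level in range(1, 8) if field & (1 << (7 - level))]


def decode_pi_status(word):
    """Decode a KL PI status word as stored in .CnCPI."""
    return {
        "requested": levels((word >> 18) & 0o177),    # Bits 11-17
        "in_progress": levels((word >> 8) & 0o177),   # Bits 21-27
        "pi_on": bool((word >> 7) & 1),               # Bit 28
        "enabled": levels(word & 0o177),              # Bits 29-35
    }


status = decode_pi_status(0o377)
print(status["pi_on"], status["enabled"], status["in_progress"])
# prints True [1, 2, 3, 4, 5, 6, 7] []
```

A value of 377 therefore means that the PI system was on, all seven levels were enabled, and no interrupt was in progress.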
3.8 THE DEVICE INTERRUPT SERVICE

A device interrupt occurs when an I/O transfer is complete, a device 
has changed status, or an error has occurred. There are two types of 
device interrupts: vectored and nonvectored interrupts. A 
nonvectored, or standard interrupt, is handled by the software. The 
interrupt handling instruction is read from the EPT and control passes 
to the CONSO skip chain to determine the device that generated the 
interrupt. Section 3.8.1 describes standard, nonvectored interrupts.

The DTEs (doorbell function only), the interval timer (on the same
level as APR interrupts), RH10, and RH20 MASSBUS controllers all
perform vectored interrupts. Vectored interrupts are not dispatched 
by the software but are automatically dispatched by the microcode. 
Section 3.8.2 describes nonstandard, vectored interrupts. 



3.8.1 Standard Interrupts

An interrupt can occur on Levels 1 through 7 only if the PI system is 
turned on, there are no higher-level interrupts in progress, and the 
PI system is enabled for interrupts on the level on which the
interrupt is requested. If these conditions are met, the interrupt 
will stop the processor and turn on a bit in the PI status word. The 
bit indicates the level on which the interrupt is requested. The 
processor then executes the instruction for handling an interrupt on 
the requested PI level. 

The location of the interrupt handling instruction is stored in the 
EPT. The exact location in the EPT is calculated from the following: 

EPT+40+2*n /where n is the PI interrupt level 

The next instruction to execute in the handling of the interrupt is 
stored in the EPT and depends on the PI level on which the interrupt 
was requested. The above calculation results in an offset into the 
EPT where the instruction is stored. Thus, if a BA10 (unit record)
I/O bus controller is assigned to PI Level 2, the formula would result
in EPT+40+(2*2). The system then executes the instruction stored at
offset 44 into the EPT.
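
The offset calculation can be checked with a short Python sketch. It assumes that the constants in the formula are octal, which is consistent with the worked result of 44 for Level 2; the function name is hypothetical.

```python
def interrupt_instruction_offset(level):
    """Return the EPT offset of the interrupt instruction for a PI level."""
    if not 1 <= level <= 7:
        raise ValueError("standard interrupts use PI Levels 1 through 7")
    return 0o40 + 2 * level


for level in range(1, 8):
    print("Level %d: EPT+%o" % (level, interrupt_instruction_offset(level)))
```

For Level 2 the sketch prints EPT+44, matching the example in the text.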

Interrupt level 0 is reserved for certain types of I/O transfers with
DTE and CI/NI (KLIPA/KLNI) devices. Level 0 bypasses the software and
is handled by the microcode, which handles interrupts on Level 0
automatically without requiring the software to store context
information and so forth.

In general, the interrupt instructions in the EPT are formatted as: 

XPCW CHnm 

where n is the CPU number (omitted for CPU0), and m is the level
number on which the interrupt is in progress. For example, CH7 means
Level 7 on CPU0. CH27 indicates Level 7 on CPU2 (the third CPU in the
configuration). The new PC flags at CHnm+2 usually include the
previous context user flag. This allows the interrupt service routine
to access the user's address space using the PXCT instruction.

The location following each XPCW in the EPT contains an instruction
that will cause an I/O page fail condition (setting the APR flag),
which will usually result in an IOP stopcode.
Using a data structure known as the CONSO skip chain, the interrupt
routine polls the devices on that interrupt level and services the 
interrupt. With the XPCW instruction, control passes to the skip 
chain. Each channel has its own skip chain, starting at the address 
pointed to by CHnm+3, whose function is to find the specific device 
that created the interrupt and then service its needs.

The monitor performs CONSO instructions to decide which device 
generated the interrupt. If it finds the interrupting device, control 
passes to the interrupt handling routine. If the device is not 
requesting an interrupt, the monitor performs a JRST instruction to 
the next CONSO instruction. If it reaches the end of the CONSO skip 
chain, it dismisses the interrupt with the following instruction: 

XJEN CHnm 

When control passes to the interrupt handling routine, the monitor 
reads the status of the device, using a CONI or DATAI instruction. On 
that basis, it may stop the device, advance buffer pointers, or 
perform cleanup operations. A CONO or DATAO instruction clears the
device interrupt status. Failure to do so would cause continual loops 
in the interrupt handling routine, and eventually the keep-alive count 
would expire. 

The KL processor uses the following instructions to perform I/O: 



     DATAI     CONI
     DATAO     CONO
     BLKI      CONSZ
     BLKO      CONSO



KS I/O processing uses the following set of instructions: 

TIOxb
RDIOb 
WRIOb 
BCIOb 
BSIOb 

When the interrupt routine is completed, control returns to the 
routine that was running before the interrupt (which may be another 
device interrupt at a lower PI level). Each interrupt routine has its
own push-down list. The push-down lists are named CnxPD1, where n is
the CPU number (omitted for CPU0), and x is the interrupt level (from
1 to 6).

Device service routines preserve the state of the machine as it 
existed before it was interrupted. They can use AC Block 0, as UUO 
level does. Accumulators used by the interrupt routine are saved on 
the stack before processing, and restored when processing is complete. 

The SAVnx routines (n = CPU number, omitted if 0, and x = interrupt 
level) are used to save/switch ACs during device interrupts. For 
example, SAV1 is the routine to save the ACs for PI Level 1 on CPU 0;
SAV11 is the routine to save the ACs for PI Level 1 on CPU 1.
Certain device interrupt routines have dedicated AC blocks, listed
below: 

     AC Block     Used for

     0            Exec-mode
     1            User-mode
     2            Terminal Scanner Interrupt Service
     3            File Interrupt Service
     4            Network Interrupt Service
     5            Reserved for Realtime Interrupt Service
     6            KL-paging Microcode
     7            Microcode

Interrupt service routines may also need to use the UPT of a job that 
is waiting for the completion of I/O, rather than the current job. In 
that case, the UBR and SPT must be modified to point to the correct 
UPT, and then switched back when the interrupt is through. The 
monitor routines that accomplish this are SVEUF, SVEUB, and SVPCS.
When you are examining a dump, be sure to check the correspondence 
between the job and the UPT/SPT. 



3.8.2 Vectored Interrupts

The KL hardware also uses vectored interrupts, which differ from the 
standard, nonvectored interrupts in that the vectored interrupt goes 
directly to the interrupt-handling routine, using a different
interrupt location in the EPT. The interval timer, the DTE (doorbell
function only), RH10s, and RH20s may do vectored interrupts.

The DTE interrupts to a location in the EPT, which is calculated as 
follows: 

EPT+142+10n /where n is the DTE number (0-3)

For the RH10/RH20 devices, the system has an internal register called 
IVIR (Interrupt Vector). When an RH10/RH20 device requests an
interrupt, the EBOX hardware/microcode dispatches to the location in
the EPT calculated as follows: 

EPT+contents (IVIR) 

This interrupt method allows the disk interrupt to vector for the 
standard interrupt location for that channel, providing device 
independence in the device interrupt handling routine. Thus, the disk 
RH10 or RH20 can load the IVIR with 40+2n and the magtape RH10 or RH20
will dispatch directly into the middle of the skip chain to service a 
specific controller. 



3.9 TRAPS

Traps differ from interrupts in that they are caused by the execution 
of a specific instruction rather than by some asynchronous event. 
When a trap occurs, the microcode stores the current PC and flags in 
the UPT. A new PC double-word, also in the UPT, specifies where 
control will pass and in what mode the processor will operate (exec or 
user mode).
3.9.1 Page Fail Traps

When a program attempts to access a page of data that is not 
available, the hardware generates a page fail trap. A page fail trap 
can occur for one of two reasons: the user tries to reference an 
address that cannot be accessed (page not in memory, page 
write-locked) or a hardware error (AR/ARX parity error, page table
parity error) occurs. When a page fail trap occurs, the processor
stores information about the trap in location 500 (.USPFW) of the
current UPT. This location is known as the page fail word.

The page fail word is formatted differently for a page reference that 
is not available and for a hardware error. The page reference to an 
address that cannot be accessed has the following format:

     +---+---+--------------+-----+---+------+-----------------+
     | U | 1 | Failure Code |     | V |      | Virtual Address |
     +---+---+--------------+-----+---+------+-----------------+
       0   1   2          5   6-7   8   9  12  13            35

In either type of page failure, the virtual address is stored in Bits
13 through 35. Bit 0 is on if the page failure occurred in user
virtual address space. If Bit 0 is off, the failure occurred in
executive virtual address space.

If Bit 1 is on, a hardware-detected error occurred, and the failure 
code is stored in Bits 1-5. The failure codes are: 

Code   Meaning 

20     No device response on UNIBUS (KS only) 
21     Proprietary violation (KL only) 
23     Address break (KL only) 
24     Illegal indirect word in EA calc (KL only) 
25     Page table parity error (KL only) 
27     Section number in EA calc greater than 37 (KL only) 
36     AR parity error (KL only) 
37     ARX parity error (KL only) 

If Bit 1 is off, Bits 2-7 have the following format: 

+-----------+ 
|A|M|S|T|P|C| 
+-----------+ 
 2 3 4 5 6 7 

Bit   Name   Meaning 

2     A      Indicates whether the mapping is valid (0 
             means a page refill is required). 
3     M      Indicates that the page has been modified. 
4     S      Reserved for use by the monitor. 
5     T      Indicates the type of page reference (0 for 
             reading, 1 for writing). 
6     P      Indicates the page is public, if set. 
7     C      Indicates whether the page is cachable. 
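
As a reading aid, the two layouts above can be decoded mechanically. The following Python sketch is not part of the monitor or of FILDDT; the function and dictionary names are invented for illustration, and it assumes the word is supplied as the two octal halves that DDT prints (left,,right).

```python
# Sketch of a page fail word decoder based on the layouts described above.
# All names here are hypothetical helpers, not monitor symbols.
FAILURE_CODES = {
    0o20: "No device response on UNIBUS (KS only)",
    0o21: "Proprietary violation (KL only)",
    0o23: "Address break (KL only)",
    0o24: "Illegal indirect word in EA calc (KL only)",
    0o25: "Page table parity error (KL only)",
    0o27: "Section number in EA calc greater than 37 (KL only)",
    0o36: "AR parity error (KL only)",
    0o37: "ARX parity error (KL only)",
}
SOFT_BIT_NAMES = ("A", "M", "S", "T", "P", "C")  # Bits 2 through 7


def bit(word: int, n: int) -> int:
    """Return bit n of a 36-bit word, where Bit 0 is the most significant."""
    return (word >> (35 - n)) & 1


def decode_page_fail_word(left: int, right: int) -> dict:
    word = ((left & 0o777777) << 18) | (right & 0o777777)
    info = {
        "user": bool(bit(word, 0)),           # Bit 0: user address space
        "hard_error": bool(bit(word, 1)),     # Bit 1: hardware-detected
        "virtual_address": word & 0o37777777, # Bits 13-35
    }
    if info["hard_error"]:
        code = (word >> 30) & 0o37            # Bits 1-5
        info["failure_code"] = code
        info["meaning"] = FAILURE_CODES.get(code, "unknown code")
    else:
        info["flags"] = {name: bit(word, 2 + i)
                         for i, name in enumerate(SOFT_BIT_NAMES)}
    return info


if __name__ == "__main__":
    # The page fail word from Example 1 in Section 3.13: 113001,,552104
    print(decode_page_fail_word(0o113001, 0o552104))
```

Applied to the word 113001,,552104 from Example 1 in Section 3.13, the sketch reports an exec-mode reference with T set (a write) to address 1,,552104, which agrees with the annotation in that example.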

At the same time the page fail word is stored, the flag-PC doubleword 
is stored at .USPFP (location 501) in the UPT and control passes to 
the address stored at .USPFP+2 (location 503), which usually contains: 

EXP SEILM 

Certain error handling routines modify .USPFP+2. If this location 
does not contain SEILM, the cause of the crash may have been a failure 
in an error recovery routine. 

SEILM examines the page fail information stored in the UPT and breaks 
down the code to find the specific cause of the problem. The 
error-handling routines are described in Chapter 5. 

Note that traps cannot be disabled and they can occur during the 
service of an interrupt. To return to the correct location, the 
Flag-PC doubleword is used. 

The page fault trap routine uses AC Block 0 and a push-down list in 
the job's UPT. 



3.10 CLOCK LEVEL 

All functions that must be performed on a periodic basis are done at 
clock level, in exec mode. Clock level may be entered in one of the 
following ways: 

o The clock ticked when the processor was in user mode. 

o A UUO could not continue execution (was blocked). 

o The null job was running and a new job became runnable. 

o A UUO completed and a clock tick occurred previously, during 
the processing of the UUO. 

A full cycle occurs when the processor enters clock level as the 
result of a clock tick; a partial cycle occurs when the processor 
enters clock level as the result of a job blocking or the null job 
detecting a newly runnable job. The full cycle starts at location 
CLKINT; a partial cycle starts at WSCHED or SCDCHK. 

A clock tick interrupt occurs at APR interrupt level but is 
rescheduled to run at Level 7. The clock tick initiates accounting 
and scheduling functions, then generates a PI Level 7 interrupt. 

Only the software will generate a Level 7 interrupt. Level 7 
interrupts and ANF-10 network interrupts are controlled by the 
software. If the scheduler is running, a Level 7 interrupt will not 
be processed. 

During the full cycle, the monitor performs the following tasks: 

o User time accounting 

o System time accounting 

o Processing timing requests 

o Checking for hung devices 

o Command processing (policy CPU only) 

o Choosing a job to run 

o Choosing a job to swap 

On a partial cycle, the system only performs user time accounting and 
then selects a job to run. A software interlock prevents a Level 7 
interrupt from interrupting the partial cycle. 

The scheduler uses the null job's push-down list, NUnPDL, and AC Block 
0. When a partial or full cycle has been done, the scheduler prepares 
and runs either a user job or the null job. 



3.11 ACCUMULATORS AND PUSH-DOWN LISTS 

The first step in finding the correct push-down list (or stack) is to 
get the right set of accumulators. When a crash occurs, the 
accumulators are saved in the following places: 



AC Block   Location 

0          .CnCA0 = .CnCAC = CRSHAC (for CPU0) 
1          .CnCA1 
2          .CnCA2 
3          .CnCA3 
4          .CnCA4 
6          Portions of .Cn6 
7          Portions of .Cn7 



The accumulators are stored when stopcode processing starts. The 
error processing routines in the monitor use a special stack, ERnPDL. 
If this is the current stack, be aware that an error may have occurred 
within the error routine. You must do the mapping, or certain stacks 
may be inaccessible. Once you have the correct accumulators, the 
stack currently in use will become readily apparent. You should check 
the stack to make sure the information in it appears to be current. 

This information is fundamental to analyzing any crash, and it may 
lead directly to the cause of the crash. Often crashes occur because 
the ACs are misused, the stack is corrupted, or there is confusion in 
the Priority Interrupt handling system. Software crashes are not 
always the result of oversights in a complicated algorithm. However, 
if the crash is due to a more obscure problem, you can use the 
information you have gathered so far to begin your investigation of 
the state of the software at the time of the crash. 

You can continue your investigation of the crash by comparing the 
state of the crash with the monitor sources. The following section 
lists the more prominent monitor modules and their functions. 



3.12 MONITOR ORGANIZATION 

Like the hardware, the software is composed of modules. Each module 
of the monitor is compiled separately, and then linked with the others 
to make up the monitor. A module is a monitor source file with 
related routines in it. For example, FILUUO deals with monitor calls 
for file access. 

The CLOCK1 module controls the following activities: 

o Perform system time accounting 

o Perform user time accounting 

o Initiate terminal command processing (COMCON) 

o Initiate scheduling (SCHED1) 

o Initiate swapping (SWPSER) 

o Perform job context switching 

The modules called from UUO level are organized hierarchically. At 
the highest level is the UUOCON module, which is responsible for UUO 
preprocessing, dispatching to the correct routine, and cleaning up 
after the function has been performed. It also contains the code for 
some of the UUOs. 

For I/O-related UUOs, UUOCON performs device-independent functions 
before dispatching to a lower level for the device drivers. The 
drivers are responsible for calling the specific modules that issue 
the I/O instructions and start the transfers. 

Most hardware interrupts enter the CONSO skip chain, which is in 
COMMON. From there, control passes to the appropriate low-level I/O 
module, or the skip chain may call a routine in the device driver. 
Certain types of hardware generate vectored interrupts, which do not 
access the skip chain. 



3.12.1 Monitor Startup Modules 

The monitor uses the following modules when it loads and starts the 
system, discarding some of them when normal timesharing begins: 

o SYSINI initializes devices and the monitor's data base in 
preparation for timesharing. It performs system startup, 
running an operator dialog to obtain date and time, and 
performs device initialization. The monitor reclaims the 
memory space used by SYSINI and uses it for dynamic storage. 

o ONCMOD holds the routines related to disk units and file 
structures. The monitor reclaims the memory for dynamic 
storage. 

o REFSTR refreshes file structures at startup time. The 
monitor reclaims the memory for dynamic storage. 

o PATCH contains extra space to patch the monitor during 
timesharing. Patch space is reclaimed starting at the 
location referenced by PATSIZ, and continues up. SYSINI and 
patch space are preserved when the monitor is run with EDDT 
loaded. 

o AUTCON dynamically configures RH10, RH20, DX10, DX20, CI20, 
NIA20, and most I/O bus hardware. The monitor does not 
reclaim AUTCON memory space, because reconfiguration might be 
required during timesharing. 

The following are optional modules that can be omitted from the 
monitor during monitor generation: 

o CPNSER holds the routines that control the processors in a 
Symmetrical Multiprocessing (SMP) system. 

o CTXSER performs job context service. 

o IPCSER handles the Interprocess Communications Facility 
(IPCF). 

o LOKCON locks jobs in core. 

o PSISER handles the Programmable Software Interrupt (PSI) 
service. 

o QUESER controls the ENQ/DEQ facility. 

o RTTRP allows for real-time programming. 



3.12.2 Symbol Definition Modules 

Some modules contain only symbols that are used by other modules. 
They do not appear in the assembled monitor: 

o F.MAC contains feature test switches. 

o S.MAC contains system symbols. 

o DEVPRM contains hardware device related symbols. 

o DTEPRM contains DTE20 parameters. 

o NETPRM contains network parameters. 

o JOBDAT contains user job data area addresses. 

o D36PAR contains DECnet parameters. 

o SCPAR contains Session Control Parameters (DECnet). 

o MACSYM contains DECnet macros. 

o KLPPRM contains CI20 parameters. 

o SCAPRM contains SCA parameters. 

o MSCPAR contains MSCP driver parameters. 

o ETHPRM contains Ethernet parameters. 

3.13 EXAMPLES OF LOCATING FAILURES 

The remainder of this chapter illustrates the crash analysis procedure 
for three types of crashes. The examples display the information 
gathered with the FILDDT patch file described in Section 2.5. 
Comments have been added here to describe the information gained from 
each command; in an actual command file, comments are illegal. 

Example 1: IME Stopcode (Illegal Memory Reference in Exec Mode) 

.RUN MONDDT                             ;Run the monitor-specific FILDDT 
File: IME004                            ;Enter crash file name 
[Looking at file DSKT:IME004.EXE[30,5653,CAG]] 
[Paging and ACs set up from the Exec Data Vector] 

diecdb/   CPU0                          ;Check that FILDDT found the 
                                        ;right CDB 
.cpslf/   CPU0                          ;DIE agrees with FILDDT 
.cpdwd/                                 ;This CPU was in DIE 
.cppgd[   700100,,4325                  ;Mapping information saved by 
                                        ;DIE 
.cptcx[   700100,,4325                  ;It matches that saved by 
                                        ;SEILM 
.uspfw/   DFDV NTLFRE#(P)               ;The page fault word 
   =113001,,552104                      ;A write attempt to 1,,NTLFRE 
.USPFP/   CAIA   =304000,,0             ;The page fault PC flags 
.USPFP+1/   P,,TIC+4                    ;an address 
$q/   XCT 0(T4)   t4/   COMTIV+4        ;at which we find part of 
1,,COMTIV+4/   MOVEM T1,CRSHWD+3(U)     ;SCNSER's typein processing 
u[   1,,552051   P,,NTLCKC#+4           ;However, U contains an 
                                        ;apparent PC, 
                                        ;rather than an LDB address 
p/   .UUPMP+616,,NU0PDL+22              ;We are on the clock-level 
                                        ;stack 
1,,NU0PDL+22/   P,,CTICOM#+5            ;The call within SCNSER which 
                                        ;failed 
1,,NU0PDL+21/   ADD '^                  ;some saved data 
1,,NU0PDL+20/   P,,TTYCM7#+4            ;The return PC from the call 
                                        ;to SCNSER 
ttycm7?                                 ;Where is this label defined? 
COMCON                                  ;In COMCON. 
$C                                      ;This is part of the TTY 
                                        ;command. 
.cpcml/   P,,NTLCKC#+4                  ;COMCON's saved LDB address 
                                        ;has the same incorrect value 
                                        ;as AC U. 
.cpisf/   .UUPMP+602,,NU0PDL+6          ;However, COMCON's saved PDL 
                                        ;pointer 
1,,NU0PDL+6[   4,,15772                 ;points at a likely LDB 
                                        ;address 
$q+ldbclp/   CAIL U,43711               ;And this LDB has a command 
                                        ;line pointer established 
.-ldbclp+ldbtit/   CCI 43705            ;So we trace its input chunk 
   UNWNDC,,PLTS5A#+1   $12t             ;stream 
   =1400,,43705                         ;(POINT 12,addr,35) 
ttchks=20                               ;These chunks are 16 words 
                                        ;long 
4,,43705/   UNWN                        ;12-bit ASCII, starting next 
                                        ;word 
4,,43706/   tt 
4,,43707/   21 
4,,43710/   115 
4,,43711/   ec 
4,,43712/   ho^e 

;"tt 21 115 echo" was the command being executed 

1,,ttycmd/   PUSHJ P,SSEC1              ;We proceed to trace the 
1,,TTYCMD+1/   PUSHJ P,SAVE2            ;execution 
1,,TTYCMD+2/   PUSH P,U                 ;of the command to see where 
1,,TTYCMD+3/   MOVE P1,U                ;U got 
1,,TTYCMD+4/   PUSHJ P,CTEXT1           ;clobbered. 
1,,TTYCMD+5/   CAIE T3,JOBVER 
   =302200,,137                         ;"_" is character code 137, 
1,,TTYCMD+6/   JRST TTYCO#              ;so we skipped this 
                                        ;instruction, 
1,,TTYCMD+7/   PUSHJ P,NTLCKJ           ;and executed this code. 

;NTLCKJ is called as a result of the NETDBJ macro 

ntlckj/   PUSHJ P,NTCHCK                ;This routine checks for 
1,,NTLCKJ+1/   JRST NTLCKJ+3            ;nesting of 
1,,NTLCKJ+2/   POPJ P,0                 ;the NETSER interlock (false) 
1,,NTLCKJ+3/   SKIPE .CPISF             ;It then checks for COMCON 
1,,NTLCKJ+4/   JRST NTLCKC#             ;(true) 
1,,NTLCKC#/   PUSHJ P,NTLCKI            ;Get the interlock 
1,,NTLCKC#+1/   JRST ANFMDL+5           ;(failure branch not taken) 
1,,NTLCKC#+2/   POP P,0(P)              ;Proceed as a coroutine 
1,,NTLCKC#+3/   PUSHJ P,@P(P) 
u/   P,,NTLCKC#+4                       ;This is the return address 
                                        ;in U! 

;At TTYCMD+2, we pushed U on the stack. We then called a coroutine. 
;We should have called NTLCKJ before we pushed U onto the stack. 
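
Example 1 annotates the word 1400,,43705 as (POINT 12,addr,35). The relationship between the two notations can be sketched as follows. This is a minimal illustration, assuming a one-word byte pointer whose position field P occupies Bits 0-5 and whose size field S occupies Bits 6-11; the function names are invented, and the indirect and index fields are ignored.

```python
# Sketch: relate the octal left half of a one-word byte pointer to
# MACRO's POINT notation. Function names are hypothetical.

def byte_pointer_fields(left_half: int) -> tuple[int, int]:
    """Return (P, S) from the left half of a byte pointer word."""
    p = (left_half >> 12) & 0o77   # position: bits remaining to the right
    s = (left_half >> 6) & 0o77    # size of the byte in bits
    return p, s


def as_point(left_half: int) -> str:
    """Render the pointer as POINT size,addr,rightmost-bit."""
    p, s = byte_pointer_fields(left_half)
    return f"POINT {s},addr,{35 - p}"


if __name__ == "__main__":
    print(as_point(0o1400))     # the pointer traced in Example 1
```

The 12-bit byte size is what leads the example to switch FILDDT to $12t typeout before reading the chunk contents.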

Example 2: UIL Stopcode (UUO at Interrupt Level) 

.RUN MONDDT                             ;Run the monitor-specific FILDDT 
File: uil002                            ;Enter crash file name 
[Looking at file DSKT:UIL002.EXE[30,5653,CAG]] 
[Paging and ACs set up from the Exec Data Vector] 

diecdb/   CPU0                          ;Check that FILDDT found the 
                                        ;right CDB 
.cpslf/   CPU0                          ;DIE agrees with FILDDT 
.cpdwd/                                 ;This CPU was in DIE 
.usmuo/   CAIA                          ;UUO PC flags 
.USMUP/   BOOTPA   =20                  ;The UUO was in the ACs 
.USMUE/   MAPBAK+1   =702432            ;UUO effective address 
.USUPF/   TLNE T1,4   =603100,,4        ;AC block 3 was current, 
$5u/   .CPCA0                           ;but FILDDT set up AC block 0, 
.cpca3$5u                               ;so we set up AC block 3 by 
                                        ;hand. 
p/   .UUPMP+623,,C4PD1+23               ;We have an interrupt level 
                                        ;stack 
C4PD1+23/   CAIA FREIN5#+5              ;which points to this return PC 
FREIN5#+5/   JRST FREIN3# 
FREIN5#+4/   PUSHJ P,CALMDA#            ;We had called this routine to 
                                        ;notify 
                                        ;the MDA of a new disk unit 
CALMDA#/   MOVE T1,0(U) 
CALMDA#+1/   MOVEI T2,0 
CALMDA#+2/   PUSHJ P,SNDMDC 
CALMDA#+3/   POPJ P,0 
CALMDA#+4/   JRST F                     ;Aha! 

;An editing error would seem to be responsible 
;The "JRST F" should be a "JRST CPOPJ1". 

Example 3: KAF Stopcode (Keep-Alive Failure) 

.RUN MONDDT                             ;Run the monitor-specific FILDDT 
File: KAF003                            ;Enter crash file name 
[Looking at file DSKT:KAF003.EXE[30,5653,CAG]] 
[Paging and ACs set up from the Exec Data Vector] 

diecdb/   CPU0                          ;Check that FILDDT found the 
                                        ;right CDB 
.cpslf/   CPU0                          ;DIE agrees with FILDDT 
.cpdwd/                                 ;This CPU was in DIE 
.cppgd[   700100,,4325                  ;Mapping information saved by 
                                        ;DIE 
.cpcpi[   1,,777                        ;CONI PI, result saved by DIE 
kafloc/   XPCW @.CPKAF                  ;Where a KAF STOPCD gets its 
                                        ;start 
                                        ;(RSX20F does an XCT of this 
                                        ;location.) 
AP0KAF# 
AP0KAF#/   CAIA   =304000,,0            ;PC flags 
AP0KAF#+1/   P,,LOKNPI   $c             ;and location 
AP0KAF#+2[   4000,,0   $s               ;new PC flags 
AP0KAF#+3/   APRKAF                     ;and location 
APRKAF/   MOVEM P,.CPSVP                ;Where the real stack pointer 
                                        ;was saved 
/   .UUPMP+603,,NU0PDL+7                ;So we examine it 
NU0PDL+7/   WRSLOC,,0 
NU0PDL+6/   P,,XMTECH#+17               ;We're inside XMTECH in 
                                        ;SCNSER 
NU0PDL+5/   P,,TTDSG1#+1                ;from the call of XMTCHR in 
                                        ;TTDINT. 
NU0PDL+6/   P,,XMTECH#+17               ;Let's look for a loop in 
                                        ;XMTECH. 
$q/   JRST XMTCH1#                      ;We're about to restart 
                                        ;XMTCHR 
1,,XMTCH1#/   PUSHJ P,LOKSCI 
1,,XMTCH1#+1/   SKIPE T1,W(U)           ;Check for output state bits 
$[   100,,0                             ;We have one, 
1,,XMTCH1#+2/   JFFO T1,APCSET+11       ;so this jumps. 
1,,APCSET+11/   JRST @XMTDSP#(T2)       ;Bit 11 was set, 
1,,XMTDSP#/   SETZ XMTXFP#              ;so we dispatch through this 
.+11./   SETZ XMTMIC#                   ;location, 
1,,XMTMIC#/   MOVE T2,ARSLOC(U)         ;getting here. 
$[   430400,,2                          ;These are our LDBMIC bits 
1,,XMTMIC#+1/   TLNE T2,20              ;(true) 
1,,XMTMIC#+2/   SKIPE KAFLOC(U)         ;(skipped) 
1,,XMTMIC#+3/   JRST MICLG3# 
1,,MICLG3#/   PUSHJ P,HPOS              ;Get horizontal position 
1,,HPOS/   PUSHJ P,SSEC1 
1,,HPOS+1/   LDB T2,LDPWID              ;Get terminal width setting 
$lt/   10 10 JOBBLT+4(U)                ;(POINT 8,addr,35-8) 
$[   2000,,50020                        ;from this value 
$q'400=4,,120                           ;Dropping the low-order 8 
                                        ;bits reveals 
                                        ;a width of ^O120 
1,,HPOS+2/   ADD T2,JOBERR+1(U) 
$/   -120                               ;Adding this gives zero 
1,,HPOS+3/   POPJ P,0   $ 
1,,MICLG3#+1/   JUMPN T2,XMTOK#         ;(Branch not taken) 
1,,MICLG3#+2/   SKIPE T2,ARSLOC(U)      ;LDBMIC again 
$[   430400,,2 
1,,MICLG3#+3/   TLNN T2,140             ;(false) 
1,,MICLG3#+4/   JRST XMTOK1# 
1,,XMTOK1#/   TLNE T2,40                ;(true) 
1,,XMTOK1#+1/   JRST XMTECH#            ;(skipped) 
1,,XMTOK1#+2/   SKIPN KAFLOC(U)         ;(false) 
$ 
1,,XMTOK1#+3/   JRST XMTCH2# 
1,,XMTCH2#/   SOSGE T4,BOOTPA(U)        ;(non-skip) 
$ 
1,,XMTCH2#+1/   JRST ZAPBUF# 
1,,ZAPBUF#/   MOVSI T1,DTEDRW#+31 
   =205100,,200 
1,,ZAPBUF#+1/   TDNE T1,W(U)            ;(true) 
$[   100,,0 
1,,ZAPBUF#+2/   JRST ZAPPI1#            ;(skipped) 
1,,ZAPBUF#+3/   SETZM BOOTPA(U) 
1,,ZAPBUF#+4/   MOVE T1,F(U) 
$[   1400,,37 
1,,ZAPBUF#+5/   CAME T1,R(U)            ;(true) 
$[   1400,,37 
1,,ZAPBUF#+6/   PUSHJ P,RCDSTP#         ;(skipped) 
1,,ZAPBUF#+7/   SKIPL SLJOBN1(U)        ;(false) 
$   654   654 
1,,ZAPBUF#+10/   JRST XMTECH# 
1,,XMTECH#/   MOVE T1,JOBBLT+2(U)   115 
$[   200,,200 
1,,XMTECH#+1/   TLNE T1,100000          ;(true) 
1,,XMTECH#+2/   JRST ECHCNR#            ;(skipped) 
1,,XMTECH#+3/   MOVE T1,JOBBLT+3(U) 
$[   10,,400 
1,,XMTECH#+4/   TLNN T1,10              ;(true) 
1,,XMTECH#+5/   TRZ T1,400              ;(skipped) 
1,,XMTECH#+6/   SKIPL WRSINS+1(U)       ;(false) 
$ 
1,,XMTECH#+7/   TRNE T1,400             ;(false) 
1,,XMTECH#+10/   TRNE T1,3000           ;(true) 
1,,XMTECH#+11/   TLNE T1,400            ;(skipped) 
1,,XMTECH#+12/   CAIA 
1,,XMTECH#+13/   JRST ECHCNR#           ;(skipped) 
1,,XMTECH#+14/   HLLZ T1,W(U) 
$[   100,,0 
1,,XMTECH#+15/   JUMPE T1,XMTIDL#       ;(branch not taken) 
1,,XMTECH#+16/   PUSHJ P,UNLSCI 
1,,XMTECH#+17/   JRST XMTCH1#           ;We're back where we started. 

;We have uncovered a loop in XMTCHR processing. 
;Comparison with the source shows that this occurs when 
;TTY DEFER is set and the line is under MIC control. 
;This can be solved by inserting a "TLZ T1,LOLMIC" just before the 
;"JUMPE T1,XMTIDL" at XMTECH+15. 
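
The width calculation in Example 3 (the LDB through a byte pointer annotated POINT 8,addr,35-8, applied to the word 2000,,50020) can be reproduced by hand. The sketch below is an illustration only; `word_from_halves` and `ldb` are invented helper names that mimic what the LDB instruction does for a pointer with position P and size S.

```python
# Sketch: reproduce the terminal-width extraction traced in Example 3.
# Helper names are hypothetical.

def word_from_halves(left: int, right: int) -> int:
    """Build a 36-bit word from the left,,right halves that DDT displays."""
    return ((left & 0o777777) << 18) | (right & 0o777777)


def ldb(word: int, p: int, s: int) -> int:
    """Load an s-bit byte that has p bits to its right, as LDB would."""
    return (word >> p) & ((1 << s) - 1)


if __name__ == "__main__":
    value = word_from_halves(0o2000, 0o50020)
    shifted = value >> 8                    # "dropping the low-order 8 bits"
    print(f"{shifted >> 18:o},,{shifted & 0o777777:o}")
    width = ldb(value, p=8, s=8)
    print(f"width = {width:o} octal = {width} decimal")
    print("HPOS result:", width + (-0o120))  # ADD T2 with a word holding -120
```

A result of zero from HPOS is why the JUMPN at MICLG3+1 is not taken in the trace.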



CHAPTER 4 
EXAMINING THE DATA STRUCTURES 



After you have isolated the failure in the monitor code, you will need 
to interpret the source code to make corrections. You must be able to 
read and understand the source code, and compare it to the 
instructions in the crash file. 

For this purpose, the monitor uses symbols to represent almost all 
values: bits, words, offsets, instructions, and more. Symbols make 
the code easier to read and modify. This chapter describes the 
conventions used in choosing symbolic names, and the tools for finding 
the symbols in the source code. 



4.1 SYMBOLS 

This section describes the types of symbols, how they are named and 
where they are stored. There is more information about symbolic 
representation and usage in the MACRO Assembler Reference Manual. 

The TOPS-10 software is made up of modules, each of which has its own 
symbolic definitions. By default, a symbol is defined and used only 
in a single module. The same symbolic name can be defined and used 
differently by different modules. 

A global symbol is available to modules other than the one in which it 
is defined. The addresses of shared tables or commonly used 
subroutines are examples of symbols defined as global. 



4.1.1 Naming Conventions 

TOPS-10 uses a consistent scheme for naming and using symbols. This 
helps you read and understand the sources. For example, the monitor 
accumulator locations have names that are consistent throughout most 
of the monitor, and they have the following values: 

Table 4-1: Monitor Accumulators 

Number   Name   Description 

0        S      Contains the I/O status word from a DDB (DEVIOS) 
                while the monitor is processing I/O operations. 
1        P      Contains the push-down list pointer currently in 
                use. 
2        T1     is an unpreserved, temporary AC. 
3        T2     is an unpreserved, temporary AC. 
4        T3     is an unpreserved, temporary AC. 
5        T4     is an unpreserved, temporary AC. 
6        W      usually contains the pointer to the process data 
                block (PDB) or the tape controller data block 
                (KDB). 
7        M      contains the user virtual address for getting 
                and putting data during UUO execution. During 
                command processing, M contains the command 
                dispatch bits. 
10       U      contains the Unit Data Block (UDB) address (for 
                FILSER or TAPSER), or the Line Data Block (LDB) 
                address in SCNSER. 
11       P1     is a preserved AC. 
12       P2     is a preserved AC. 
13       P3     is a preserved AC. 
14       P4     is a preserved AC. 
15       J      contains the job number, high segment number, or 
                disk controller data block (KON) address at 
                interrupt level. 
16       F      contains the DDB address during I/O. It is used 
                as a temporary register in non-I/O situations. 
17       R      is a general-purpose, scratch AC. 

The uses for each accumulator may change from one release of the 
software to the next. You should always check the source code to see 
how the program uses a specific accumulator in a specific situation. 
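
Because DDT types small values symbolically, an accumulator name in a crash listing often stands for a plain number. For instance, "P,,NTLCKC#+4" in Example 1 of Section 3.13 is a left half of 1, and "JRST F" in Example 2 is a jump to location 16. The Python sketch below records the table above as a lookup and pulls the AC field out of an instruction's left half; the helper names are invented, and it assumes the usual instruction layout with the AC field in Bits 9-12.

```python
# Sketch: map accumulator numbers (octal) to the monitor's AC names
# listed in Table 4-1. Helper names are hypothetical.
AC_NAMES = {
    0o0: "S", 0o1: "P", 0o2: "T1", 0o3: "T2", 0o4: "T3", 0o5: "T4",
    0o6: "W", 0o7: "M", 0o10: "U", 0o11: "P1", 0o12: "P2", 0o13: "P3",
    0o14: "P4", 0o15: "J", 0o16: "F", 0o17: "R",
}
AC_NUMBERS = {name: number for number, name in AC_NAMES.items()}


def ac_field(left_half: int) -> int:
    """Return the AC field (Bits 9-12) from an instruction's left half."""
    return (left_half >> 5) & 0o17


if __name__ == "__main__":
    # TLNE T1,4 is displayed as 603100,,4 in Example 2.
    print(AC_NAMES[ac_field(0o603100)])
    # "JRST F" therefore jumps to this location (printed in octal):
    print(f"{AC_NUMBERS['F']:o}")
```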

To restore accumulators correctly, several standard subroutine return 
sequences have been set up. The main subroutine does a JRST to one of 
the following locations: 

Subroutine 
Name         Function 

CPOPJ        Regular POPJ return 
CPOPJ1       Increment return address and then POPJ (skip return) 
CPOPJ2       Double skip return 
TPOPJ        Restore T1 and return 
TPOPJ1       Restore T1 and skip return 
T2POPJ       Restore T2 and return 
T2POJ1       Restore T2 and skip return 
MPOPJ        Restore M and return 
FPOPJ        Restore F and return 
FPOPJ1       Restore F and skip return 
WPOPJ        Restore W and return 
JPOPJ        Restore J and return 

Symbolic names for locations in the monitor are one to six characters 
in length. Usually, all six characters are used. The first three 
characters identify the data structure and type of symbol; the last 
three describe the unique word or field. 

Symbols for data structures usually take one of two forms: 

dddxxx 

.ddxxx 

where ddd or dd represents the data structure and xxx represents the 
field or word. Some data structures are: 

Symbol    Data Structure 

.C0xxx    CPU data block for CPU0 (in low segment) 
.C1xxx    CPU data block for CPU1 (low segment) 
.Cnxxx    CPU data block (n = CPU number) 
.CPxxx    CPU data block for current CPU (high segment) 
.PDxxx    Process data block 
.USxxx    User Process Table 
.CTxxx    Context block offsets 
.CXxxx    Context saved parameters block offsets 
ACCxxx    Access table 
BAFxxx    Bad allocation file block 
CHNxxx    Channel data block 
DEVxxx    Device data block 
HOMxxx    Home blocks 
JBTxxx    Job tables 
JOBxxx    Job data area 
KDBxxx    Common controller data block 
KONxxx    Disk controller data block 
LDBxxx    Line data block 
NMBxxx    File name block 
PPBxxx    Project programmer number data block 
RIBxxx    Retrieval information block 
SABxxx    Storage allocation block 
STRxxx    File structure data block 
TKBxxx    Tape controller data block 
TTFxxx    Forced command table 
TUBxxx    Magnetic unit data block 
UDBxxx    Common unit data block 
UFBxxx    UFD data block 
UNIxxx    Disk unit data block 
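
When reading a crash listing, it can help to classify an unfamiliar symbol by its prefix. The following Python sketch is a hypothetical convenience and not a tool shipped with TOPS-10; it encodes only the prefixes listed above and returns None for anything else.

```python
# Sketch: guess which data structure a monitor symbol belongs to from
# the naming convention described above. The function is hypothetical.
DOT_PREFIXES = {
    ".CP": "CPU data block for current CPU (high segment)",
    ".PD": "Process data block",
    ".US": "User Process Table",
    ".CT": "Context block offsets",
    ".CX": "Context saved parameters block offsets",
}
PREFIXES = {
    "ACC": "Access table", "BAF": "Bad allocation file block",
    "CHN": "Channel data block", "DEV": "Device data block",
    "HOM": "Home blocks", "JBT": "Job tables", "JOB": "Job data area",
    "KDB": "Common controller data block",
    "KON": "Disk controller data block", "LDB": "Line data block",
    "NMB": "File name block",
    "PPB": "Project programmer number data block",
    "RIB": "Retrieval information block",
    "SAB": "Storage allocation block",
    "STR": "File structure data block",
    "TKB": "Tape controller data block", "TTF": "Forced command table",
    "TUB": "Magnetic unit data block", "UDB": "Common unit data block",
    "UFB": "UFD data block", "UNI": "Disk unit data block",
}


def classify_symbol(symbol: str):
    """Return the data structure suggested by the symbol's prefix, or None."""
    name = symbol.upper()
    if name.startswith(".C") and len(name) > 2 and name[2].isdigit():
        return f"CPU data block for CPU{name[2]}"
    return DOT_PREFIXES.get(name[:3]) or PREFIXES.get(name[:3])


if __name__ == "__main__":
    for sym in (".USPFW", ".CPSVP", "JBTSTS", "LDBCLP", ".C0CAC"):
        print(sym, "->", classify_symbol(sym))
```

The prefix is only a hint. As noted earlier, a module can define a local symbol that happens to share a prefix, so the source remains the authority.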

Byte pointers referencing fields within these data structures are 
named in the following way: 

aacbbb 

where: 

aa represents the first two letters of the three letter name 
c represents one of Y, M, B, V, S, or N 
bbb represents the name of the pointer 

For example, a pointer in the BAF block is named BAYbbb. 

Bits within words are usually defined as one of the following: 

xx.yyy 

xxPyyy 

where: 

xx is the data structure 
yyy is the bit name 

Here are some examples: 

TO.yyy    Bits in CONO TIM, 
TI.yyy    Bits in CONI TIM, 
LI.yyy    Bits in CONI/CONO PI, 
LP.yyy    Bits in CONO/CONI APR, 
JS.yyy    Bits in JBTSTS (job status word) 



4.1.2 Symbol Files and Monitor Generation 

Several of the monitor modules contain only symbol definitions. They 
are used to define the software features and hardware configuration in 
the process of building the monitor. 

The first step in generating the monitor is to run the MONGEN program 
(MONitor GENerator). It asks a series of questions about the hardware 
configuration and the software options to be selected. For more 
information about the MONGEN program, refer to the TOPS-10 Software 
Installation Guide. 

MONGEN creates symbol-definition files that describe the aspects of 
the system. After running MONGEN, the system installer can build the 
monitor with standard source code libraries, or, if changes have been 
made to the sources, the monitor must be built from separate modules. 

If the systems programmer does not want to make any changes to the 
standard release of TOPS-10, the programmer compiles the common 
modules and loads them with a distributed library file of the 
remaining monitor modules. 

It is common practice, however, to make modifications to the TOPS-10 
source code. If changes have been made to one or more TOPS-10 source 
modules, the modules of the monitor must be assembled separately to 
build a library file. 

Next, the MONGEN files must be assembled with the monitor's common 
modules, which are: 

o COMMOD defines the disk data base. 

o COMDEV defines all other devices. 

o COMMON describes the CPU, memory, scheduler, job tables, and 
so forth. 



4.2 READING THE CODE 

There are two important sources of information in analyzing system 
crashes: the crash file and the monitor source code. The key to 
successful crash analysis is to be able to compare the crash file and 
the source code. Refer to the TOPS-10 MACRO Assembler Reference 
Manual for information about the source code and assembler language 
conventions. 



4.2.1 How to Use a CREF Listing 

The listings of the monitor source code should be cross-referenced 
(CREF) listings. You will find a CREF listing more useful than 
unassembled source code because CREF produces a sequence-numbered 
assembly listing, followed by tables showing where symbols are defined 
and referenced. To find a symbol in a module, you need only look in 
one of these tables, which points to a line number in the assembly 
listing. The CREF program is described in the TOPS-10 User Utilities 
Manual. 



4.2.2 Macros 

A macro is a set of frequently used instructions in a sequence that 
can be called with a single pseudo-instruction. A macro allows the 
system programmer to supply arguments to a single instruction, which 
the assembler expands to the desired instruction(s). Macros make it 
difficult to read the code, however, unless you understand the purpose 
of some commonly-used macros. 

Several macros are used to define symbols. These macros are defined 
in S.MAC: 

o XP (A,B) defines the global symbol A as being equal to B, but 
DDT will not display A (A==:B) . 

o ND (A,B) defines A as a global symbol equal to B using the XP 
macro, if A has not already been defined. 
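
The difference between XP and ND is essentially "always define" versus "define only as a default". The Python sketch below models that distinction with a dictionary standing in for the assembler's symbol table. It is an analogy rather than a description of how MACRO implements the macros, and the symbol name used is made up for illustration.

```python
# Sketch: model XP and ND as operations on a symbol table.
# SYMBOLS stands in for the assembler's table; BUFSIZ is a made-up name.
SYMBOLS: dict[str, int] = {}


def xp(name: str, value: int) -> None:
    """XP(A,B): define global symbol A as B (suppressed from DDT typeout)."""
    SYMBOLS[name] = value


def nd(name: str, value: int) -> None:
    """ND(A,B): define A as B through XP, but only if A is not yet defined."""
    if name not in SYMBOLS:
        xp(name, value)


if __name__ == "__main__":
    nd("BUFSIZ", 0o200)   # not defined yet, so the default is taken
    nd("BUFSIZ", 0o400)   # already defined, so this has no effect
    print(f"{SYMBOLS['BUFSIZ']:o}")
```

This is why a value seen in the crash file may come from an earlier definition, such as a MONGEN-generated file, rather than from the ND default in the module being read.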

There are many other commonly-used macros in the monitor, including: 

o $XHGH, $HIGH, $LOW, $CSUBS, and $ABS, which place code in the 
extended high segment, high segment, low segment, common 
subroutines, and an absolute physical location, respectively. 
Code usually goes in the monitor's high segment, which is 
write-protected; data goes in the low segment, which is 
writable. $ABS is usually used to place data in physical 
Page 0 of memory (Words 0-777). 

o Ordinarily, an instruction in a user program is executed 
entirely in user address space, and an instruction in the 
monitor is executed in the executive address space. But to 
facilitate communication between the monitor and users, the 
monitor can execute instructions to refer to locations in the 
other address space. This feature is implemented by the 
previous context execute (PXCT) instruction. The following 
macros allow you to execute PXCT: 

1. EXCTUX moves information from the user's address space to 
the monitor. 

2. EXCTXU moves information from the monitor's address space 
to the user's. 

3. EXCTUU moves information from one location in the user's 
address space to another. 

o The USERAC and EXECAC macros generate code to switch between 
accumulator blocks. USERAC switches to AC Block 1. EXECAC 
switches to the monitor's AC block. If no argument is given, 
the switch is made to AC Block 0. If an argument is given, 
the AC block specified by the argument is used. 



4.2.3 Conditional Assembly 

Parts of the monitor are assembled on an optional basis, depending on 
conditions defined by an assembler IF statement. 

F.MAC has most of the symbol definitions that are used for conditional 
assembly. Most symbols are of the form FTxxxx, where FT stands for 
Feature Test and xxxx is the specific option. Some of the feature 
test symbols and the functions they enable are: 

FTKL10    KL10 processor 

FTKS10    KS10 processor 

FTMP      SMP (multiple-processor) system 

FTDUAL    Dual-ported disks are supported 



4.2.4 Finding Symbols 

When trying to find a symbol in the monitor, you should follow these 
steps: 

1. Check the symbol table at the back of the CREF listing you 
are currently looking at. If one of the numbers after the 
symbol name has a pound sign (#) next to it (as in number#), 
the symbol is defined on that line of the code. If the 
symbol appears in the CREF listing with no line numbers that 
have pound signs, the symbol is global, or it is defined in a 
universal file. 

2. If a symbol is defined in a universal file, check your CREF 
listings of S.MAC, DEVPRM.MAC, DTEPRM.MAC, NETPRM.MAC, 
MACSYM.MAC, and JOBDAT.MAC. If the symbol is not defined in 
any of these modules, the symbol is probably global. 

3. If the symbol is not defined in the source module or the 
universal files, you must obtain a GLOB listing of the 
monitor. The GLOB listing points to the modules where global 
symbols are defined and used. Search the symbol tables at 
the back of those modules. (GLOB creates listings of global 
symbols from binary files. It is described in the TOPS-10 
User Utilities Manual.) 

4. If you are not successful in searching the listings, run the 
monitor-specific FILDDT and use the "symbol?" instruction to 
find the module where it is defined. If you type a symbol 
name followed by a question mark, FILDDT displays the module 
where it is defined. 

Monitor parameters used by certain modules are often 
associated with global symbols that are defined in those 
modules. LINK can detect the parameters that are assigned 
different values by different modules. FILDDT lists only one 
module where each global symbol is defined, and displays a 
"G" next to global symbols. If a symbol is not global, 
several modules may be listed as containing the symbol. You 
can unlock the local symbols for a certain module by issuing 
the following FILDDT command: 

module$: 
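
Step 1 of the procedure above amounts to scanning a symbol's cross-reference entry for line numbers marked with a pound sign. The sketch below illustrates that check in Python. The entry format it parses (a symbol followed by space-separated line numbers) is a simplification assumed for illustration, and the symbols and line numbers in the sample are invented; a real CREF listing should be read as printed.

```python
# Sketch: decide where to look next for a symbol, given one simplified
# cross-reference entry. The entry format and sample data are assumptions.

def parse_cref_entry(line: str):
    """Split an entry into (symbol, defining lines, referencing lines)."""
    symbol, *refs = line.split()
    defined = [int(r[:-1]) for r in refs if r.endswith("#")]
    used = [int(r) for r in refs if not r.endswith("#")]
    return symbol, defined, used


def where_to_look_next(line: str) -> str:
    symbol, defined, _ = parse_cref_entry(line)
    if defined:
        return f"{symbol} is defined in this module at line {defined[0]}"
    return (f"{symbol} is global or defined in a universal file; "
            "check the universal files, then a GLOB listing")


if __name__ == "__main__":
    print(where_to_look_next("TTYCM7 1204 1310# 1422"))
    print(where_to_look_next("NTLCKJ 88 93"))
```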

The monitor uses many fixed and dynamic data structures for job 
control, for memory management, and for device control. Some of the 
data structures that are important for crash analysis are described 
briefly in the following sections. For more specific information 
about the contents of these data structures, refer to the TOPS-10 
Monitor Tables Descriptions. 



4.3 JOB-RELATED DATA STRUCTURES 

Information about a job is kept in the monitor's low segment or in 
per-process address space (such as the UPT and JOBDAT). Most of the 
following data structures are job tables, and have JBT as the first 
three letters of the symbolic name (an exception is TTYTAB). Most job 
tables have one entry in the table per job. Some of these tables also 
have entries for high segments, because the monitor sometimes treats 
high segments like jobs. 

The following job tables hold information about the status and 
condition of the job: 

o JBTSTS, JBTST2, and JBTST3 contain the current state of the 
job, including the processor queue, execution status, 
swapping status, event wait condition, and whether the job is 
logged in. 

o JBTCQ and JBTCSQ hold the processor queue number, subqueues, 
and scheduler class for each job. These tables are organized 
as a series of linked lists. 

o JBTSWP holds the disk address of the swapped-out job. 

The following tables hold the features and options for the job: 

o JBTPRV holds the job's privileges. 

o JBTSPL holds the spooling bits for the job. These control 
how and when requests to spooled devices (LPT, PLT, and so 
forth) are handled. 

o JBTSCD holds the job's scheduler class. 

o JBTWCH controls the WATCH information displayed by the 
monitor for the job. 

o JBTLIM holds the CPU run-time limit for the job. The monitor 
checks this value before processing batch jobs. 

The following tables describe the user and the program being run: 

o JBTNAM holds the program name. 

o JBTPPN holds the project-programmer number. 

o JBTLOC holds the ANF-10 node number for remote spooling. 

o JBTUPM, a component of the SPT, points to the physical page 
of this job's UPT when the job is swapped in. 

The following tables are used to point to the location of another 
job-related table: 

o JBTSGN contains the address of the job's high segment 
descriptor blocks. 

o JBTPDB holds the address of the job's Process Data Block (the 
PDB). 

The Process Data Block (PDB) stores more job-related information, 
including: 

o User name (in SIXBIT) 

o Accumulated run-time, core and disk usage 

o Virtual memory limits 

o IPCF information 

o Current program name and directory 

o The job's search list 

o Context flags, quotas, and chain pointers 

The words in the PDB are named .PDxxx, where xxx is the specific word. 

The remainder of the job-related information is stored with the job 
itself in JOBDAT or the UPT. JOBDAT holds the user accumulators when 
the job is not running, the starting address of the program, the 
addresses of DDT and the symbol table, and other locations required to 
run the program. 



4.4 CPU DATA STRUCTURES 

The CPU Data Block (CDB) contains most of the CPU-specific 
information. On a multi-processing system of two or more KL 
processors, the monitor maintains a different CDB for each processor. 

The CDB is divided into two sections: one for constant definitions 
and the other for variable definitions. The constants area holds such 
information as the following: 

o CPU number 

o Instructions to execute in certain situations, such as device 
interrupts 

o Bit masks 

o Hardware constants 

The variables area stores such information as: 

o Stopcode information 

o Hardware error information 

o Performance information 

o Frequency of certain events 

o Per-CPU patch space 

The CDB words are named .CPxxx or .Cnxxx, where n is the CPU number 
and xxx is the unique symbol for the word. On a single-CPU system, 
the .CPxxx format is always valid. In a multi-CPU system, .CPxxx 
refers to the current CPU (or, in FILDDT, the CPU that is currently 
mapped). To refer to the data on a CPU other than the one you are 
currently accessing, use the .Cnxxx format, replacing n with the 
CPU number (0 through 2). 

The COMMON module contains the CWRD macro to define constants and 
variables in the CPU Data Block (CDB). CWRD is called in the 
following way: 

CWRD (nam,val,len,lbl) 

where: 

nam is the word name 
val is the optional value to store in this address (default=0) 
len is the optional length of storage area (default=1) 
lbl is the optional alternate label for old-style CPU0 references 

For example, the following instruction defines .CnOK as a global 
symbol with a value of -1: 

CWRD (OK,-1) 

Similarly, the following instruction defines .CnACN as a word in the 
CDB variables area, with the alternate name APRSTS: 

CWRD (ACN,,1,APRSTS) 

The scheduler uses a series of tables to control the use of the CPU. 
Some of the scheduler tables are: 

o QBITS determines how the scheduler should move a job from one 
wait state to another. 

o SSCAN and SQSCAN tell the scheduler the order and direction 
the run queues should be scanned to find a runnable job. 

o Transfer tables control the destination queue for requeued 
jobs. 

The AVALTB table contains flags to indicate whether a sharable 
resource has become available. A sharable resource is a portion of 
the monitor that can only be used by one process at a time. 

Some of the sharable resources are: 

Name Resource 

AU Alter UFD (one per UFD, per structure) 

CX PDB/context block interlock word (one per job) 

DA Allocate disk space (one per disk unit) 

EV Use executive virtual memory 

MM Memory management (for modifying the data base) 

REQTAB contains the number of jobs waiting for each resource. A value 
of -1 in REQTAB indicates that the resource is available; a value of 
zero means that a job has the resource and no other job is waiting. 
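
The REQTAB convention above can be sketched in a few lines of Python. 
This is only an illustration for reading values out of a dump: the 
function names are hypothetical, and the treatment of positive values 
as a count of waiting jobs is inferred from the description above. 

```python
WORD_BITS = 36                      # PDP-10 words are 36 bits wide
WORD_MASK = (1 << WORD_BITS) - 1


def to_signed36(word):
    """Interpret a 36-bit word as a two's-complement integer."""
    word &= WORD_MASK
    return word - (1 << WORD_BITS) if word & (1 << (WORD_BITS - 1)) else word


def describe_reqtab(word):
    """Describe one REQTAB entry (hypothetical helper, not a monitor routine)."""
    count = to_signed36(word)
    if count == -1:
        return "available"
    if count == 0:
        return "in use, no job waiting"
    if count > 0:
        return f"in use, {count} job(s) waiting"
    return "unexpected value"


print(describe_reqtab(0o777777777777))  # all ones is -1 in 36 bits
print(describe_reqtab(0))
print(describe_reqtab(2))
```

An "unexpected value" result for an entry in a crash file would be a 
reason to look more closely at that resource. 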

INTTAB describes each hardware interrupt routine. Each two-word entry 
contains the PI level, the address of the DDB (or prototype DDB), and 
the CPU to which the device is connected. 



4.5 MEMORY DATA STRUCTURES 

The monitor uses PAGTAB and PT2TAB to allocate user and monitor memory 
space (usually referred to as "core"). The tables contain one word 
for each page of physical memory. A job's allocation of pages is 
maintained as a forward linked list using PAGTAB, and as a backward 
linked list with PT2TAB. All the pages for a job are linked using the 
right half of a PAGTAB and PT2TAB entry. PAGPTR contains the starting 
address for the linked list of free pages. The left half of each 
PAGTAB and PT2TAB entry contains bits describing how the page is 
used: whether it is locked, locked in executive virtual memory, and 
so forth. The monitor uses PT2TAB to obtain information about 
swapped-out pages. 
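
When a crash involves memory management, it can help to check that the 
forward and backward page chains agree. The following Python sketch 
models that check on tables copied out of a dump. The helper names, the 
sample data, and the use of zero as an end-of-chain marker are all 
assumptions made for the illustration, not monitor definitions. 

```python
RH_MASK = 0o777777  # right half (low 18 bits) of a 36-bit word


def walk_pages(pagtab, first_page, end_marker=0):
    """Follow right-half links forward, returning page numbers in order.

    end_marker is an assumed terminator, not a documented monitor value.
    """
    pages, seen = [], set()
    page = first_page
    while page != end_marker:
        if page in seen:
            raise ValueError(f"chain loops back to page {page:o}")
        seen.add(page)
        pages.append(page)
        page = pagtab[page] & RH_MASK
    return pages


def back_links_agree(pages, pt2tab):
    """Check that each page's backward link names its predecessor."""
    return all((pt2tab[cur] & RH_MASK) == prev
               for prev, cur in zip(pages, pages[1:]))


# Made-up sample: three pages, with flag bits in the left half of one entry.
pagtab = {0o152: (0o400000 << 18) | 0o153, 0o153: 0o160, 0o160: 0}
pt2tab = {0o152: 0, 0o153: 0o152, 0o160: 0o153}

chain = walk_pages(pagtab, 0o152)
print([oct(p) for p in chain])
print(back_links_agree(chain, pt2tab))
```

Masking with the right half before following a link matters, because 
the left half carries status bits rather than part of the page number. 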

MEMTAB also has one entry for each page in memory. The monitor uses 
MEMTAB during swapping and paging requests, to keep track of where 
pages are stored in the swapping area and which page to transmit next. 

The monitor also maintains areas of dynamic storage called free core, 
allocated in four-word chunks, using a bit table to determine which 
chunks are in use and which are not. 



4.6 COMMAND PROCESSING TABLES 

The command processor uses several tables to verify and control 
monitor commands, including COMTAB, DISP, and UNQTAB. COMTB2, DISP2, 
and UNQTB2 are used to describe SET commands. COMTBC, DISPC, and 
UNQTBC are for customer use. 

TTFCOM is the forced commands table. This table is used if the 
monitor determines that a job must execute a command immediately, 
regardless of the job's current state. The monitor does not place the 
commands in the TTFCOM table into a terminal input buffer before 
processing the command. 



4.7 UUO PROCESSING TABLES 

UUOTAB contains the addresses of the operator-dependent UUO routines. 
The addresses are arranged in order of UUO opcode, with one half word 
devoted to each address. The UUO handler verifies whether the UUO is 
valid and dispatches to the address stored in UUOTAB. If the UUO is 
illegal, control passes to an error routine called UUOERR. 

The tables UCLJMP and UCLTAB are used for the CALL and CALLI UUOs. 
UCLTAB contains the names for the CALL UUOs; UCLJMP contains the 
addresses of the CALL/CALLI routines. 



4.8 I/O DATA STRUCTURES 

The most dynamic and interrelated data structures in the monitor are 
those related to I/O. The data structures that are common to almost 
all I/O operations are the Job Device Assignment table (JDA), the 
device data block (DDB), and user I/O buffers. Other data structures 
exist to control specific types of hardware: disk or tape units, 
device controllers, or software I/O channels. For certain devices 
(such as disk), an extra level of organization is imposed: the 
logical file structure, requiring additional data structures. 



4.9 THE JOB DEVICE ASSIGNMENT TABLE 

The Job Device Assignment table (starting at USRJDA in the UPT) holds 
the addresses of the DDBs currently in use by the job. It is indexed 
by the software channel number. When the user issues a UUO to 
initiate I/O, a software channel number must be supplied, which is 
associated with the device or file to be accessed. More channels are 
available in the extended channel table, stored in funny space. 
Extended channel table entries are in the same format as the JDA 
table. The contents of .USCTA in the UPT point to the extended 
channel table. 

The left half of the JDA entry for a channel contains status bits that 
indicate which UUOs have been successfully completed for this channel. 
Following are some of the status bits, which are defined in S.MAC: 

Bit   Symbol   Meaning 

 0    INITB    An OPEN or INIT has been done on this channel. 
 1    IBUFB    INIT specifying input buffers was done. 
 2    OBUFB    INIT specifying output buffers was done. 
 3    LOOKB    LOOKUP was done. 
 4    ENTRB    ENTER was done. 
 5    INPB     INPUT was done. 
 6    OUTPB    OUTPUT was done. 
 7    ICLOSB   CLOSE (input side of channel) was done. 
 8    OCLOSB   CLOSE (output side of channel) was done. 
 9    INBFB    INBUF was done. 
10    OUTBFB   OUTBUF was done. 
11    SYSDEV   System device, or [1,4] for disk area. 
12    RENMB    RENAME UUO in progress. 
13    RESETB   RESET UUO in progress. 



4.10 THE DEVICE DATA BLOCK 

The monitor uses the Device Data Block (DDB) to control each device. 
The information in the DDB comes from a monitor call and is read by 
the interrupt handling routine to perform the I/O. The handler 
records the status of the operation in the DDB. The monitor and the 
user can read the status of the I/O operation from the DDB. For 
example, the monitor can detect a hung condition by checking a timer 
in the DDB. 

User programs can include the same instructions to perform I/O with 
disk devices, magnetic tapes, and line printers, because the format of 
the DDB is similar for all devices. The monitor handles the devices 
differently by handling the DDBs differently and by ignoring any 
information in the DDB that is not relevant to the specific device. 
For example, the monitor creates DDBs for single-user devices when the 
system comes up; these DDBs are never deleted. The monitor simply 
updates the information in the data block. For sharable devices, such 
as disk devices, the monitor creates DDBs dynamically in the user's 
funny space, when a channel is opened. The DDB for the channel is 
deleted when the channel is closed. Spooled devices, such as line 
printers, are handled in a similar manner. 

A device on an ANF-10 network front-end requires a special kind of 
DDB, because remote stations can have line printers or card readers. 
When a user first accesses the remote device, NETSER creates a DDB for 
the device. COMDEV contains the prototype network DDB. 

NETDEV contains the I/O routines for specific network devices. For 
example, the RDXSER routine, in NETDEV, handles RDA devices, and the 
TSKSER routine handles intertask communication. 

DTESER contains the DTE device handling routine for DECnet front-ends 
(DN20s running MCB software). The DTE DDB is dynamically created for 
the purpose of loading and dumping the front-end memory. 

All DDBs include the following locations: 

o DEVNAM contains the SIXBIT device name. 

o DEVBUF contains the addresses of the user buffers. 

o DEVMOD describes the type of device. 

o DEVIOS is the I/O status word. 

o DEVSER contains a pointer to the next DDB and the address of 
the dispatch table. 

Most devices are configured dynamically by the monitor. A prototype 
DDB exists for each type of device. When a recognized hardware device 
is detected by the monitor, a DDB is created and the contents of the 
prototype DDB are copied into the new DDB. Then, specific information 
(device names, unit numbers, and so forth) is filled in. Prototype 
DDBs are linked into the DEVLST chain. They may also be found by 
indexing into DDBTAB using the .TYxxx value for the device in 
question. For example, .TYMTA has a value of 2, so DDBTAB+2 contains 
the address of the prototype magtape DDB. 



Device              Module    DDB       Hardware Interface 

Card reader         CDRSER    CRIDDB    CR10 I/O BUS 
                    DCRSER    DCRDDB    CD20/RSX-20F 
Card punch          CDPSER    CDPDDB    CP10/CP10D I/O BUS 
Line printer        DLPSER    DLPDDB    LP20/RSX-20F 
                    LP2SER    LP2DDB    LP20/UNIBUS (KS10 only) 
                    LPTSER    LPTDDB    BA10/LP100 I/O BUS 
Magtape             TAPUUO    TDVDDB    All interfaces 
Plotter             PLTSER    PLTDDB    XY10 I/O BUS 
Paper tape reader   PTRSER    PTRDDB    CR04 I/O BUS 
Paper tape punch    PTPSER    PTPDDB    CR04 I/O BUS 



4.11 FINDING DDB INFORMATION 

The following example shows how to look at a crash file to find the 
DDBs and other information about I/O. In this example, Job 7 was 
running LPTSPL. You must first issue the mapping command ($6U) to 
map the UPT through Job 7, rather than through the UPT for the job 
that was currently running. A typical command sequence might be: 



JBTNAM+7$6T/   LPTSPL 
JBTUPM+7[   42000,,152 
-n$6U 



where n is the CPU number of the CPU that is currently mapped. 
The commands to look at the user job device assignment table are: 

USRJDA[     506000,,65334   ;Channel 0 
UPMP+652[   506000,,65414   ;Channel 1 
UPMP+653[   0               ;Channel 2 
UPMP+654[   0 
UPMP+655[   0 
UPMP+656[   0 
UPMP+657[   0 
UPMP+660[   0 
UPMP+661[   0 
UPMP+662[   0 
UPMP+663[   0 
UPMP+664[   0 
UPMP+665[   0 
UPMP+666[   0 
UPMP+667[   0 
UPMP+670[   0               ;Channel 17 (octal) 

The commands to display the device names are: 

$$6T 
65334/   LPT0 
65414/   LPT1 

Both devices are printers, controlled by LPTSPL. 

The left half of each JDA entry contains bits indicating the UUOs 
executed for that channel. The left half of each JDA entry shown above 
contains 506000, which indicates that Bits 0, 2, 6, and 7 are turned 
on. These bits are set for the following UUOs: 

Bit 0   OPEN/INIT 

Bit 2   OUTBUF 

Bit 6   OUTPUT 

Bit 7   CLOSE (input side, as input is not allowed on LPTs) 
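
Decoding the left half by hand is error-prone, because PDP-10 bits are 
numbered from the left (Bit 0 is the most significant bit of the word). 
The following Python sketch decodes a JDA left half into the status bit 
symbols listed earlier in this section. The function name is 
hypothetical; only the bit positions and symbols come from this manual. 

```python
# Status bit symbols for JDA Bits 0 through 13, in bit order.
JDA_BITS = [
    "INITB", "IBUFB", "OBUFB", "LOOKB", "ENTRB", "INPB", "OUTPB",
    "ICLOSB", "OCLOSB", "INBFB", "OUTBFB", "SYSDEV", "RENMB", "RESETB",
]


def decode_jda_left_half(left_half):
    """Return the symbols for the bits set in an 18-bit JDA left half."""
    names = []
    for bit, name in enumerate(JDA_BITS):
        # Bit 0 is the leftmost bit of the 18-bit half word (octal 400000).
        if left_half & (1 << (17 - bit)):
            names.append(name)
    return names


print(decode_jda_left_half(0o506000))
```

For the value 506000 this yields INITB, OBUFB, OUTPB, and ICLOSB, 
which corresponds to Bits 0, 2, 6, and 7. 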

The user buffers are the next source of information. Find the output 
buffer for LPT261 by examining the left half of the DEVBUF word in the 
DDB, which holds the address of the output ring header: 



65414+DEVBUF/   45150,,0   ;output-header,,input-header 



The user buffers are always in user address space. To examine 
locations in user address space, switch mapping to the user job. 
JBTUPM shows that the UPT starts at 152; therefore, the command to 
switch mapping to user space is: 

152$1U 

Now you can examine the contents of the output ring header: 



45150/   44351      ;Current buffer addr+1 
45151/   10700,,0   ;Byte pointer 
45152/   -1         ;Byte count 



Location 45150 contains the address of the second word of the current 
buffer, which contains the address of the next buffer in the buffer 
ring, and so forth. You can locate all the buffers in the ring using 
the same method: 



44351/   176,,44551   ;Buffer 1 
44551/   176,,44751   ;Buffer 2 
44751/   176,,44151   ;Buffer 3 
44151/   176,,44351   ;Buffer 4 



Therefore, there are four buffers set up. The right half of the 
header word points to the next buffer in the ring. The left half 
holds the use bit and the buffer size. Bit 0 is the use bit (BF.IOU), 
and its setting indicates the following state in the following types 
of buffers: 

                Buffer Empty   Buffer Full 

Input Buffer         0              1 
Output Buffer        1              0 

In the left half of the header words listed above, Bit 0 is off, 
indicating that the output buffers were full. The remainder of the 
left half holds the buffer size, in this case, 176 (octal) words. 
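
The ring walk described above can be sketched in Python, using the 
values from this example as a small memory image. The function and 
variable names are hypothetical; the layout (use bit in Bit 0, size in 
the rest of the left half, link in the right half) is taken from the 
description above. 

```python
RH_MASK = 0o777777    # right half: address of the next buffer
USE_BIT = 0o400000    # Bit 0 of the word, as seen in the left half
SIZE_MASK = 0o377777  # remaining left-half bits: buffer size


def word(left, right):
    """Build a 36-bit word from two 18-bit halves."""
    return (left << 18) | right


# Memory image copied from the example in this section.
memory = {
    0o45150: word(0, 0o44351),        # ring header: current buffer addr+1
    0o44351: word(0o176, 0o44551),
    0o44551: word(0o176, 0o44751),
    0o44751: word(0o176, 0o44151),
    0o44151: word(0o176, 0o44351),
}


def walk_ring(memory, ring_header):
    """Return (address, use_bit, size) for every buffer in the ring."""
    first = memory[ring_header] & RH_MASK
    buffers, addr = [], first
    while True:
        left = memory[addr] >> 18
        buffers.append((addr, bool(left & USE_BIT), left & SIZE_MASK))
        addr = memory[addr] & RH_MASK
        if addr == first:
            return buffers
        if len(buffers) > len(memory):
            raise ValueError("ring does not close")


for addr, in_use, size in walk_ring(memory, 0o45150):
    print(f"{addr:o}: use bit {int(in_use)}, size {size:o}")
```

A ring that never returns to its first buffer raises an error, which 
in a real dump would suggest a corrupted buffer header. 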

To read the contents of the first buffer, use the following commands: 

$$7T 

44151/ (apHt@ 

44152/ } 

44153$0T/ GLE 

GGGGGGGGGGGG 



File format: ASCII Print mode: ASCII /DELETE _'^L 
RKE^RKEIRKE^RKR IIIIIIIII PPPPPPPPPPPP EEE 



EEEEEEEEEEEEEEEE 

The rest of the buffer contains the banner page printed by LPTSPL 
immediately before printing a file. LPTSPL had just begun printing a 
file when the system crashed. 

Job 7 is using two DDBs, but it is also important to check the 
extended channel table for the job. In this case, it reveals more 
DDBs. Note that the left half of the pointer to the extended channel 
table does NOT contain a section number, as might seem immediately 
apparent. Only the right half of this word is a valid pointer to 
data: 

.UPMP+USCTA[   21,,341200 

341200[   651500,,340000   ;Channel 20 
341201[   651400,,340063   ;Channel 21 
341202[   651400,,340146   ;Channel 22 

These DDBs are in funny space, so they are disk DDBs. They contain 
the following file names: SYS:LPFORM.INI[1,4], DSKC:ERROR.FS[6,6], 
and DSKC:GRIPE.SRJ[1,2]. The DDBs are displayed as follows: 

340000/   SYS 
340000+DEVNAM/   LPFORM 
340000+DEVEXT/   INI ( 
340000+DEVPPN[   1,,4 

340063/   DSKC 
340063+DEVNAM/   ERROR 
340063+DEVEXT/   FS A 
340063+DEVPPN[   6,,6 

340146/   DSKC 
340146+DEVNAM/   GRIPE 
340146+DEVEXT/   SRJ 
340146+DEVPPN[   1,,2 

Because the banner page that was being printed has the file name 
GRIPE, it is clear that the third disk DDB is associated with the file 
that was being printed at the time of the crash. 



4.12 LINE DATA BLOCKS (LDBS) 

The monitor uses terminals in two different ways: they are the means 
to enter commands directly to the monitor, and they are also subject 
to control by user programs. To serve both functions, there are two 
data structures: the terminal DDB and the Line Data Block (LDB). 

LDBs contain information about a terminal line. There is one LDB for 
each terminal and it is built when the monitor is initialized. LDBs 
are not created dynamically; they continue to exist as long as the 
system is in operation. This allows users to type commands on 
terminals even though they are not logged in, and permanent LDBs speed 
response because the monitor does not have to spend the time 
allocating an LDB. The code to allocate and initialize the LDBs is in 
SCNSER, and it is discarded when system initialization is complete. 

In general, an LDB contains: 

o Pointers to input and output chunks (terminal I/O buffers) 

o Counts of how many characters are currently in the chunks 

o Pointer to its associated DDB 

o Line status bits 

o Line characteristic bits 

o Position counter 

o MIC information 

o Break characters 

o Count of characters to echo 

You can use LINTAB to locate the LDB entry for a terminal line. 
LINTAB contains one entry for each terminal in the system (including 
CTYs and PTYs). Use the TTY number as the offset into LINTAB. The 
LINTAB entry (a fullword global address) points to the LDB, and the 
first word of the LDB points to the terminal DDB (if the terminal DDB 
exists). 



4.13 THE SCNSER DATA BASE 



SCNSER processes user input and calls the appropriate module to handle 
the I/O. The SCNSER data base is composed of the following virtual 
memory sections: 

Data         Memory Section   Used for 

LINTAB       Section 0        Translates line no. to LDB addr 
DSCTAB       Section 0        Translates modem no. to line no. 
DDB pool     Section 0        TTY device data blocks 
LDBs         Section 4        Line data blocks 
Chunk pool   Section 4        Buffers 



4.14 TERMINAL CHUNKS 

Terminal data is usually stored in eight-word buffers called TTY 
chunks. In 12-bit ASCII mode, the terminal chunk size varies. 
Examine the value of TTCHKS to see the current size of a terminal 
chunk. The terminal chunk starts with a pointer to the previous 
chunk, and a pointer to the next chunk, followed by the character 
data. 

Chunks are maintained as doubly linked lists, using half word links 
relative to Section 4. Each terminal line can potentially have four 
linked lists of chunks: one for input, one for output, a list for 
filler characters, and a list for out-of-band characters. When chunks 
are no longer needed by a terminal line, they are returned to a free 
list of chunks. The LDB contains pointers to the chunks. 

Each character in a chunk is stored as a 12-bit byte, permitting a 
maximum of 21 characters to be stored in a chunk (3 to a word). In 
reading the characters in terminal chunks using FILDDT, use the $12T 
command to break up the 36-bit word into 12-bit bytes (4 bits for 
flags + 8 bits for data). 
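
The packing described above can be modeled in Python when you need to 
decode chunk words outside FILDDT. This is a sketch under stated 
assumptions: the helper names are hypothetical, the flag bits are 
assumed to be the high-order 4 bits of each byte, and one link word per 
eight-word chunk is inferred from the 21-character maximum. 

```python
BYTE_MASK = 0o7777  # one 12-bit byte


def unpack_word(word):
    """Split a 36-bit word into three 12-bit bytes, leftmost first."""
    return [(word >> shift) & BYTE_MASK for shift in (24, 12, 0)]


def split_byte(byte12):
    """Return (flags, character code); assumes flags are the high 4 bits."""
    return byte12 >> 8, byte12 & 0o377


def pack_word(text):
    """Helper for the demonstration: pack up to 3 characters, flags zero."""
    codes = [ord(c) for c in text.ljust(3, "\0")[:3]]
    return (codes[0] << 24) | (codes[1] << 12) | codes[2]


def chunk_text(data_words):
    """Recover the characters held in the data words of a chunk."""
    chars = []
    for w in data_words:
        for b in unpack_word(w):
            _flags, code = split_byte(b)
            if code:
                chars.append(chr(code))
    return "".join(chars)


CHUNK_WORDS = 8   # eight-word chunk
LINK_WORDS = 1    # assumed: both half word links share one word
CHARS_PER_CHUNK = (CHUNK_WORDS - LINK_WORDS) * 3

print(chunk_text([pack_word("DT "), pack_word("MEM")]))
print(CHARS_PER_CHUNK)
```

Seven data words at three characters each give the 21-character 
maximum stated above. 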

The monitor keeps all the chunks in a pool. The TTYINI routine, in 
SCNSER, initializes the chunks, allocating space for them and creating 
the links. 

The location TTFTAK points to the first free chunk in the pool. When 
a terminal needs a chunk, it gets the chunk pointed to by this 
location. TTFPUT points to the last free chunk in the list and 
returned chunks are stored after this chunk. TTFREN contains the 
number of free chunks in the system. The macros LDCHK, LDCHKR, and 
STCHK place characters in the chunks and remove characters from the 
chunks. These macros are useful in terminal handling; however, they 
should not be called when SCNSER interrupts are enabled. 

o LDCHK takes a character out of a chunk, and does not give 
back used chunks (useful when echoing input) . 

o LDCHKR takes a character out of a chunk and returns used 
chunks to the pool, if necessary. 

o STCHK puts a character in a chunk, allocating chunks from the 
pool, if necessary. 
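
The free list behaves like a first-in, first-out queue: chunks are 
taken from the front (TTFTAK) and returned at the back (TTFPUT). The 
following Python model illustrates that behavior only. The class and 
method names are hypothetical, and the real monitor links the chunks in 
place rather than keeping a separate queue. 

```python
from collections import deque


class ChunkPool:
    """Toy model of the free chunk list (not the monitor's implementation)."""

    def __init__(self, addresses):
        self.free = deque(addresses)

    @property
    def ttftak(self):
        """Address of the first free chunk, or None if the pool is empty."""
        return self.free[0] if self.free else None

    @property
    def ttfput(self):
        """Address of the last free chunk, or None if the pool is empty."""
        return self.free[-1] if self.free else None

    @property
    def ttfren(self):
        """Number of free chunks."""
        return len(self.free)

    def take(self):
        """Allocate the chunk at the front of the free list."""
        if not self.free:
            raise RuntimeError("no free chunks")
        return self.free.popleft()

    def give_back(self, address):
        """Return a chunk; it is stored after the current last free chunk."""
        self.free.append(address)


pool = ChunkPool([0o430271, 0o430301, 0o430311])  # made-up addresses
chunk = pool.take()
pool.give_back(chunk)
print(pool.ttfren, oct(pool.ttftak), oct(pool.ttfput))
```

In a crash file, a TTFREN value that disagrees with the number of 
chunks actually reachable from TTFTAK would point to a damaged list. 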



4.15 TERMINAL DEVICE DATA BLOCKS 

Terminal device data blocks are allocated from the TTY DDB pool as 
jobs are created, or as the terminal is assigned by a job on another 
terminal. Some types of information that are stored in the terminal 
DDB are: 

o Pointers to user buffers 

o Device and logical names for the terminal 

o I/O status information (DEVSTA) 

o Device mode information (DEVMOD) 

o CPU number of the CPU that owns this terminal 

o Pointer to the LDB 

Every job has a terminal DDB for its controlling terminal, whether the 
job is attached or not. Terminal DDBs are created when a job number 
is assigned (that is, when a program is run) and when a terminal is 
assigned or OPENed by another job. If the job is not logged in when 
the program finishes, the DDB is deleted. If the job is logged in, 
the DDB remains until the job logs out or detaches. 

TTYTAB is a table in COMMON that has one entry per job and points to 
the DDB of the controlling (attached) terminal of the job. If a 
program opens a software channel for a terminal, an entry is made in 
the channel table for the terminal. 

LDBs and DDBs are linked when a job is created or a terminal is 
attached to a job. These links are destroyed when: 

o You log out or detach your job. 

o A node goes down when the terminal is connected. 

o You hang up the modem of a terminal that is connected. 

o You release a terminal on a software channel. 

TTYATI attaches the terminal to the job when the job is created; 
TTYATT attaches the terminal for the ATTACH command. 



4.16 FINDING TERMINAL I/O INFORMATION 

The following example shows how to extract information from the 
terminal chunks for a job. In this case, you are examining Job 17, 
which is running PIP. First, look at TTYTAB, which points to the 
terminal DDB for the job: 

TTYTAB+21[   102206 
102206$6T/   TTY124 

As the first word of the block verifies, it is a terminal DDB. Next, 
find the LDB by looking at the DDBLDB word: 

102206+DDBLDB[   4,,450430 

4,,450430[   102206 

The DDB pointer in the first word of the LDB is correct. Next, 
examine the LDB: 



4,,450431[   0 
4,,450432[   100000,,0 
4,,450433[   10000,,0 
4,,450434[   0 
4,,450435[   0 
4,,450436[   0 
4,,450437[   1400,,426522 
4,,450440[   1400,,426522 
4,,450441[   0 
4,,450442[   0 
4,,450443[   301400,,422450   ;Ptr to put output characters 
4,,450444[   301400,,430276   ;Ptr to take output characters 
4,,450445[   2137             ;No. of characters in output 

The pointers are PDP-10 byte pointers. The memory address in the 
right half points to the terminal chunk, which can be displayed by: 

4,,430276$12T/   <space><space>< 

The pointer is in the middle of the chunk. Determine the chunk size 
in order to know where the chunks begin and end: 

TTCHKS=10 

Now, start from a few locations back, and you can see the start of 
the chunk: 

4,,430275/   10 
4,,430274/   MEM 
4,,430273/   DT 
4,,430272/ 
4,,430271/   =417221 

The contents of location 4,,430271 are a backward pointer in the left 
half, and the location of the next chunk in the right half. The chunk 
itself holds the text "DT MEM 10 <***>." 

By examining the next chunks, you can deduce the entire message: 

DT      MEM    10  <***>  666405   9-Jul-80 
BOOTH   DOC    10  <***>  157023  27-Jul-79  4A(46) 
BOOTH   EXE    28  <***>  411354  26-Jul-82  4A(46) 
BOOTH   HLP     2  <***>  500576   5-Jan-75 
BOOTH   MAC   108  <***>  010501  27-Jul-79 
BOOTH   MEM    29  <***>  544353  27-Jul-79 
BOOTS   DOC    35  <***>  352703  17-Jul-79 
BOOTS   EXB    10  <***>  556224  26-Jul-82 
BOOTS   MAC    92  <***>  764007  31-Jul-79 
BT128K  EXB    10  <***>  605464  26-Jul-82 
BT256K  EXB    10  <***>  556224  26-Jul-82 
WIBOOT  EXE    32  <***>  607553  30-Nov-79  7(12) 
WLBOOT  EXE    32  <***>  325717  30-Nov-79  7(12) 
WSBOOT  EXE    24  <***>  631454  30-Nov-79  7(12) 
WTBOOT  DOC    18  <***>  451662  28-Jun-79 
WTBOOT  MAC    29  <***>  007472  20-Jul-79 
DML6A   DOC     3  <***>  331675   7-Mar-79 
DMPFIL  EXE    16  <***>  071372  16-Jul-80  6A(7) 
DMPFIL  MAC    34  <***>  661675   7-Mar-79 
DMPFIL  MEM     5  <***>  077054   8-Mar-79 
COPY    EXE     8  <***>  605250  17-Jul-80  7(101) 
CPY007  DOC     4  <***>  507510   8-Mar-79 
DTC007  DOC     3  <***>  204110   8-Mar-79 
DTCOPY  EXE    20  <***>  456574  17-Jul-80  7(101) 
DTCOPY  MAC    43  <***>  303311 

The user was reading a BACKUP tape directory listing when the system 
crashed. 



4.17 TAPE DRIVES 

The data structures for tape drives parallel the actual hardware 
components. Depending upon the hardware interface, a magtape 
controller may be connected to as many as 15 drives. The software has 
up to 15 tape unit data blocks (TUBs) connected to a tape controller 
data block (KDB), which then points to a channel data block (CHN). 

There is one TUB for each tape unit in the system. It contains the 
unit name, pointers to the DDB and controller, error counts, tape 
label information, and a pointer to the IORB (I/O request block, the 
request to the controller outlining the I/O transfer). The first word 
in each TUB is the SIXBIT name of the tape unit, in the form: 

MTxy 

where x = the controller name and y = the unit number. For example: 

MTA0 
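
When you look at the first word of a TUB or DDB without a SIXBIT 
typeout mode, it helps to know how the name is packed: six 6-bit 
characters per 36-bit word, left-justified, with each character code 
equal to its ASCII code minus 40 (octal). The following Python sketch 
shows the conversion in both directions; the function names are 
hypothetical. 

```python
def sixbit_encode(name):
    """Pack up to six characters into one 36-bit SIXBIT word."""
    word = 0
    for ch in name.upper()[:6].ljust(6):
        code = ord(ch) - 0o40          # SIXBIT code is ASCII minus octal 40
        if not 0 <= code <= 0o77:
            raise ValueError(f"{ch!r} has no SIXBIT code")
        word = (word << 6) | code
    return word


def sixbit_decode(word):
    """Unpack a 36-bit SIXBIT word, dropping trailing blanks."""
    chars = [chr(((word >> shift) & 0o77) + 0o40)
             for shift in range(30, -1, -6)]
    return "".join(chars).rstrip()


word = sixbit_encode("MTA0")
print(f"{word:012o}")
print(sixbit_decode(word))
```

The same conversion applies to any SIXBIT name word in the monitor, 
such as DEVNAM in a DDB. 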

The prototype TUBs are: 

Symbol    Units 

DX1UDB    DX10/TX01/TX02 
T78UDB    TM78 
TCXUDB    TC10C 
TM2UDB    TM02/TM03 
TMXUDB    TM10B 
TS1UDB    SA10/TX01/TX02 
TX2UDB    DX20/TX02 

The KDB identifies a controller and there is one for each tape 
controller in the system. It holds the name of the controller, a 
pointer to the next KDB, the channel command list, a list of TUBs 
owned by the controller, and controller-dependent information. In the 
monitor, KDBs are pointed to by KDBTAB+.TYxxx. The name of the 
controller is stored in the first word as MTn, where n is the 
controller number. The KDB also points to the channel it is connected 
to. 

The prototype KDBs are: 

DX1KDB 
T78KDB 
TCXKDB 
TM2KDB 
TMXKDB 
TS1KDB 
TX2KDB 

Channel data blocks exist for channels that are connected to any type 
of controller. They hold enough information to start and monitor the 
channel transfer, including: 

o Error counts 

o Retry information 

o Channel status 

o Channel queue 

At system startup, AUTCON creates one magtape DDB for each unit on 
each controller. The start of a magtape DDB can be obtained from 
DDBTAB+.TYMTA. The magtape DDB is named: 

MTxu 

where: 

x is the alphabetic controller name (A for controller 0, B for 
  controller 1, and so forth) 
u is the unit number 

A special magtape DDB (called a Label DDB) is required for the tape 
label processor (PULSAR). This is needed so I/O can be performed by 
two different jobs (the user job and the job running PULSAR), while 
the device remains assigned to the user job. The label information is 
stored in the Tape Unit Data Block (TUB), which is common to both the 
magtape and the label DDB. 

The name of a label DDB is in the form: 

I *■ Lxu 

The values of x and u are the same as shown above for the magtape DDB. 
The label DDB has the same format as a magtape DDB. 



4.18 DISKS 

Disks are the most complex peripheral I/O devices in a timesharing 
system. They are shared among jobs, using a logically structured file 
system to store data and prevent destructive interference. The basic 
unit of disk storage is one block (equal to 128 words). 

TOPS-10 organizes information into logical groups known as files. The 
contents of a file are referenced by the file specification, which 
uniquely identifies the file. A file specification has four 
components: 

o A file structure name, which identifies the disk drive or 
group of disk drives where the file is stored 

o An ordered list of directory names (MFD, UFD, and SFDs, if 
any) 

o A file name of one to six alphanumeric characters 

o A file extension of zero to three alphanumeric characters 

A file structure is a logical device name that refers to one or more 
physical disk units. Using the file structure name, the user job need 
never know the exact physical unit where data is stored. 

The directory where a file is stored helps to uniquely identify the 
file. TOPS-10 organizes files by using file structures, User File 
Directories (UFDs), and Sub-File Directories (SFDs). A UFD or SFD is 
itself a file, and contains a list of all files for a user, and a 
pointer for accessing those files. 

The Master File Directory (MFD) points to all the UFDs on a particular 
disk file structure. There is one MFD for each file structure, 
containing the names and addresses of all the UFDs on that structure. 

Each UFD can optionally contain Sub-File Directories (SFDs). An SFD 
is a logical group of files within the UFD. SFDs can contain their 
own sub-file directories, which can be nested to a level of five SFDs 
in a single UFD. 

The UFD is named with the user's PPN, in brackets. For example, the 
user with PPN 10,507 has the following UFD: 

[10,507] 

You specify an SFD by typing the name of the UFD, followed by the name 
of the SFD (up to six alphanumeric characters). For example, the UFD 
[10,507] could contain a file called FIRST.SFD. To access the files 
in this SFD, the user specifies the following directory: 

[10,507,FIRST] 
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In the SFD, the user keeps a file called SECOND. SFD, which points to a 
nested SFD . To access files in the nested SFD, the user types the 
following directory name: 

[10, 507, FIRST, SECOND] 

The monitor does not write the data on disk in physically consecutive 
disk blocks. The monitor must allocate disk space effectively in a 
dynamic situation where users are constantly creating, deleting, 
modifying, and appending to variable-length files. Therefore, the 
monitor segments disk space into blocks and stores files in space that 
is available throughout the file structure. 

To maintain this complex storage system, the monitor must maintain 
some amount of overhead data for retrieving files and allocating disk 
space. The RIB (Retrieval Information Block) contains the retrieval 
information for the file. 

A RIB is a block on the disk that contains retrieval pointers to the
blocks making up the entire file. The UFD points to the first RIB for
each file. Each retrieval pointer in the RIB describes a contiguous
block of data called a "group." The retrieval pointer contains the
first physical disk address of the group and the number of blocks that
are in the group. UFDs and MFDs also have RIBs to describe their
locations on the disk unit.

A retrieval pointer contains the following information: 

o The number of clusters in this group 

o The cluster number where the group starts 

o The checksum for the group 

One of the following conditions is possible, if the left half of the 
retrieval pointer is zero: 

o If Bit 18 = 1, Bits 19 through 35 contain the logical unit 
number of the next unit to get data from. This allows one 
RIB on one unit to hold pointers to data on another unit in 
the same structure. 

o If the right half is zero, there is no more data in the file. 
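The two special cases above can be told apart mechanically. The following is a minimal Python sketch, not the monitor's own code: it assumes a pointer is held as a 36-bit integer with bit 0 as the most significant bit, and the function name `classify_pointer` is hypothetical. Ordinary pointers are not unpacked further, because the widths of the count, cluster, and checksum fields are not given in this section.

```python
HALF_MASK = 0o777777   # 18 bits per halfword
BIT_18 = 0o400000      # bit 18 is the top bit of the right half


def classify_pointer(word):
    """Classify a 36-bit retrieval pointer by the two rules in the text."""
    left = (word >> 18) & HALF_MASK
    right = word & HALF_MASK
    if left != 0:
        # Ordinary pointer: count, cluster number, and checksum fields.
        return ("group", None)
    if right == 0:
        return ("end-of-file", None)
    if right & BIT_18:
        # Bits 19 through 35 hold the logical unit number.
        return ("unit-change", right & 0o377777)
    return ("unknown", None)


print(classify_pointer(0o400000))                    # unit change to unit 0
print(classify_pointer((0o4010 << 18) | 0o100332))   # ordinary group pointer
```

The two sample words are the ones that appear in the MFD RIB example in Section 4.18.1.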

If a file needs more than one RIB to retrieve the data, it has
extended RIBs at the start of subsequent groups. The monitor also
writes an extra copy of each RIB as the last block pointed to by the 
RIB, for disk error recovery purposes. That copy is known as the 
spare RIB. The first RIB is known as the prime RIB. 

Each disk unit contains a HOME block, which describes the file 
structure that contains the disk unit, and points to the MFD. Blocks 
1 and 10 (decimal) on the disk contain the HOME block, which records 
the following information: 

o The file structure to which this unit belongs, and the unit's
position within the structure

o The characteristics of the unit and file structure 

o A pointer to the MFD 

The monitor uses the HOME block to find the MFD when the file 
structure is mounted for a user. 

The monitor keeps information about used disk blocks in the Storage
Allocation Tables (SAT blocks). The SAT block on each file structure
is stored as SYS:SAT.SYS. Each bit in the SAT block represents a
group of contiguous disk blocks called a cluster.

The smallest unit of data on disk that the monitor can allocate is the
cluster, which is composed of a specific number of disk blocks. A
small disk unit might use a cluster size of 3 blocks (600 (octal)
words). If the monitor must allocate space to a file that is smaller
than 200 (octal) data words, an entire cluster is allocated. When the
cluster size increases, fewer SAT blocks are required for storage
allocation information; with fewer reads/writes to the SAT, a smaller
number of operations is required to assign and release disk space.

Large clusters save memory at the expense of disk space. Because disk 
space is allocated in clusters, short files result in wasted space if 
the cluster size is too large. 

The MFD contains pointers to the UFDs on the disk unit. The UFD
contains a two-word entry for each file in the UFD. The UFD entry
specifies the file name in the first word, the file extension in the
left half of the second word, and a pointer to the file, called the
compressed file pointer (CFP), in the right half of the second word.
The CFP is the 18-bit address of the RIB of the file, pointing to the
first supercluster of the file. A supercluster is a set of clusters
stored contiguously on disk. A file always starts at the supercluster
boundary, but one file may fill many superclusters of disk space.

The number of blocks per cluster is usually equivalent to the number
of blocks per supercluster. However, if the total number of clusters
on a file structure is greater than 262,143, the clusters are
regrouped into superclusters such that the number of superclusters is
less than or equivalent to 262,143 (the largest number that can be
stored in the right half of the second word in the UFD entry). The
number of clusters per supercluster is stored in the HOME block, and
in the STR block when the monitor is running.
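The regrouping rule can be illustrated with a little arithmetic. The Python sketch below computes the smallest grouping factor that satisfies the 262,143 limit; this is an assumption made for illustration, and the monitor may well choose a different factor. The function name is hypothetical.

```python
MAX_CFP = 0o777777   # 262,143: the largest value an 18-bit right half can hold


def clusters_per_supercluster(total_clusters):
    """Smallest grouping factor that keeps the supercluster count <= MAX_CFP."""
    return max(1, -(-total_clusters // MAX_CFP))   # ceiling division


for clusters in (200_000, 262_143, 262_144, 1_000_000):
    print(clusters, clusters_per_supercluster(clusters))
```

On a structure small enough to stay under the limit the factor is 1, which is the "usually equivalent" case described above.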



4.18.1 Finding Information on Disk 

The following example shows how to use FILDDT to retrieve information
stored on a disk, using the /U switch to look at a disk unit. This
example shows how to locate the contents of the file
DSKA:H616.TXT[64,2]; DSKA is mounted on RPB1.

First, run the monitor-specific FILDDT (MONDDT in this manual), and
specify the physical disk unit you want to examine, followed by the /U
switch:

.R MONDDT

File: RPB1:/U

/U requires that you be logged in as [1,2], and instructs FILDDT to 
treat the disk as addressable. 

The first data structure to use in examining the file is the HOME
block. It holds pointers to other files, and can always be found at
Blocks 1 and 10 (decimal) on a disk. To access the first word of the
HOME block, specify location 200 to FILDDT. Each block is
128 (decimal) words, which equals 200 (octal).

Remember to convert disk block numbers to FILDDT addresses by
multiplying by 200. If converting cluster addresses, multiply by
200*n, where n is the cluster size. For example, if the cluster size
is 5, use the following calculation to specify the block number. (The
numeric base of each number in the following calculations is indicated
by (8) for octal and (10) for decimal.)

Block 15(10) = Block 17(8) * 200 = 3600(8) in FILDDT 

Cluster 11(10) = Cluster 13(8) * 5 = Block 67(8) = 67 * 200 = 15600 
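The same conversions can be checked with a few lines of Python. This is only a sketch of the arithmetic described above; the function names are hypothetical, and `halfwords` simply prints an address in the LH,,RH octal form that FILDDT uses in the examples that follow.

```python
WORDS_PER_BLOCK = 0o200   # 128 (decimal) words per disk block


def block_to_filddt(block):
    """FILDDT address of the first word of a disk block."""
    return block * WORDS_PER_BLOCK


def cluster_to_filddt(cluster, blocks_per_cluster):
    """FILDDT address of the first word of a cluster."""
    return block_to_filddt(cluster * blocks_per_cluster)


def halfwords(address):
    """Format an address as LH,,RH in octal."""
    return f"{address >> 18:o},,{address & 0o777777:o}"


print(oct(block_to_filddt(15)))                       # Block 15(10)
print(oct(cluster_to_filddt(11, 5)))                  # Cluster 11(10), size 5
print(halfwords(block_to_filddt(0o4204)))             # block holding the MFD RIB
print(halfwords(cluster_to_filddt(0o16424, 0o12)))    # cluster 16424, size 12(8)
```

The last two calls reproduce addresses used later in this section: block 4204 (octal) maps to 1,,41000 and cluster 16424 (octal) maps to 44,,262000.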

To examine the HOME block, type the following: 

200/   HOM          ;Name of HOME block
201/   DSKAOl       ;Unit ID
202/
203/
204/   DSKA         ;Structure name

The pointer to the MFD's RIB is at offset HOMMFD:

200+HOMMFD/   4204

This location contains the block number. All subsequent addresses are
cluster numbers. The size of a cluster is stored in the HOME block at
location HOMBSC:

200+HOMBSC/   12     ;Blocks per supercluster
200+HOMBPC/   12     ;Blocks per cluster

In this case, a cluster is 10 (decimal) blocks.

The MFD's RIB confirms that you have the correct RIB: 

4204*200/    777653,,41
1,,41001/    1,,1        ;Owner of file
1,,41002/    1,,1        ;File name
1,,41003/    UFD)EC      ;File extension in left half

Examine the first retrieval pointer to find the MFD itself. The right 
half of the contents of the first word in the RIB contains the offset 
within the RIB to the first retrieval pointer. The left half of the 
first word is the negative of the maximum number of retrieval pointers 
that may be stored in the RIB. 

1,,41001+41/   400000          ;Unit change pointer to Unit 0
1,,41002+41/   4010,,100332    ;1st real retrieval pointer
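The first word of the RIB shown above (777653,,41) can be unpacked as follows. This minimal Python sketch assumes that the left half is an 18-bit two's-complement value, which is how the PDP-10 represents negative numbers; the function name is hypothetical.

```python
def decode_rib_header(left, right):
    """Unpack the first RIB word from its two 18-bit halves.

    The left half is the negative of the maximum number of retrieval
    pointers; the right half is the offset to the first pointer.
    """
    max_pointers = (-left) & 0o777777   # undo the 18-bit negation
    return max_pointers, right


max_pointers, offset = decode_rib_header(0o777653, 0o41)
print(oct(max_pointers), oct(offset))
```

For this RIB the sketch yields 125 (octal) as the pointer limit and 41 (octal) as the offset, which is why the retrieval pointers are examined at offset 41.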

The first cluster of the MFD is number 332. This corresponds with
Block 332*12=4204 (octal), the address of the RIB (stored in HOMMFD,
shown above). The RIB is stored in the first block of the
supercluster when the file is initially allocated. The monitor checks
to see if the RIB address is the same as the first group of data. If
so, the monitor retrieves the second block for data. Look at 1,,41200
(4204*200+200) for the MFD:



1,,41200/   1,,1       ;[1,1] UFD
1,,41201/   UFD  :     =654644,,332
1,,41202/   1,,4       ;[1,4] UFD
1,,41203/   UFD        =654644,,3
1,,41204/   3,,3       ;[3,3] UFD
1,,41205/   UFD  >     =654644,,336
1,,41206/   10,,1      ;[10,1] UFD
1,,41207/   UFD  ?     =654644,,337
1,,41210/   1,,2       ;[1,2] UFD
1,,41211/   UFD  @     =654644,,340
1,,41212/   1,,5       ;[1,5] UFD
1,,41213/   UFD  A     =654644,,341
1,,41214/   1,,3       ;[1,3] UFD
1,,41215/   UFD  B     =654644,,342
1,,41216/   64,,2      ;[64,2] UFD
1,,41217/   UFD  E     =654644,,345



The first word of each two-word MFD entry contains the UFD name. The 
second word contains the UFD extension in the left half and the 
supercluster address of the RIB in the right half. The pointer to the 
UFD RIB is located at supercluster 345 (assuming the supercluster size 
is equivalent to 1) . 
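The second word of a directory entry can be decoded by hand or with a short script. The Python sketch below assumes the standard SIXBIT encoding (each 6-bit code is the ASCII code minus 32); the function names are hypothetical.

```python
def sixbit_to_text(value, width):
    """Decode `width` SIXBIT characters (6 bits each, ASCII code minus 32)."""
    chars = []
    for i in range(width):
        shift = 6 * (width - 1 - i)
        chars.append(chr(((value >> shift) & 0o77) + 32))
    return "".join(chars).rstrip()


def decode_second_word(left, right):
    """Split the second word of a directory entry into (extension, CFP)."""
    return sixbit_to_text(left, 3), right


ext, cfp = decode_second_word(0o654644, 0o345)
print(ext, oct(cfp))
```

Applied to the [64,2] entry (654644,,345), this gives the extension UFD and supercluster address 345, matching the reading in the paragraph above.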



345*12*200/       777653,,41
1,,RNA2CB+71[     1,,1         ;Owner of file
1,,RNA2CB+72[     64,,2        ;File name
1,,RNA2CB+73/     UFD)EC       ;LH = file extension

345*12*200+41/    400000
1,,RNA2CB+133/    1000,,345    ;Location of UFD



Again, the RIB takes up the first block of the cluster. Add 
200 (octal) to the address of the RIB to get the first data block of 
the UFD. If the cluster size is 1 block, you have to read the 
retrieval pointer for the first data block. 

345*12*200+200/   F601
1,,RNA3CB+71/     EXE &S
1,,RNA3CB+72/     D602
1,,RNA3CB+73/     EXE GN

1,,DSKDB+1/       H616
1,,DSKDB+2/       TXT!T4   =647064,,16424



The location of the RIB for the file is at Supercluster 16424:

16424*12*200/      777653,,41
44,,262001/        64,,2
44,,262002/        H616
44,,262003/        TXT)CT        ;LH = file extension

16424*12*200+41/   400000
44,,262042/        1655,,616424



Finally, you reach the file, which contains:

44,,262200/
DATA
44,,262201/   A AT
44,,262202/   TIME
44,,262203/   OF SE
44,,262204/   R0 62.
44,,262205/   CRASH
44,,262206/
44,,262207/   VMA,
44,,262210/   PC=53
44,,262211/   7771
44,,262212/
(FRO
44,,262213/   M KLD
44,,262214/   CP AL
44,,262215/   L COM
44,,262216/   MAND)

Reformatting to make reading easier yields the following:

DATA AT TIME OF SERO 62. CRASH

VMA, PC=537771
(FROM KLDCP ALL COMMAND)



4.18.2 In-Core File Information

To keep accurate information in a readily accessible place, the 
monitor maintains information about the following, in memory: 

o Structure information 

o Device information 

o File information 

o User information 

To access a file structure, the monitor keeps a file structure data 
block called STR. It contains the name of the structure, allocation 
information, swapping information, and pointers to MFD and HOME 
blocks. The STRs are stored in a linked list, each entry pointed to 
by the system table TABSTR. A structure is identified by the offset 
into TABSTR where its entry is stored. The word SYSSTR points to the 
first structure. The STR also points to the physical units in the 
file structure. 

The Unit Data Block (UDB) contains information about the physical disk 
unit, including: 

o Physical unit name 

o Pointers to related UDBs 

o Pointers to HOME blocks and SAT blocks 

o Unit parameters (cluster size, and so forth) 

The UDBs for each structure are linked and each UDB points back to the 
STR. Because of these linkages, the STR points only to the first UDB. 
The UDB addresses are dynamically assigned by AUTCON. 

The STR accesses the following data structures:

o SABs (Storage Allocation Blocks) are in-core copies of the
SAT tables. Copies of the SATs are read into memory at
system startup and updated on disk after every write
operation.

o SPTs (Storage allocation Pointer Tables) contain pointers to
all SAT blocks for a unit. Do not confuse the SPTs (Storage
allocation Pointer Tables) used in disk I/O, with the SPT
(Special Pages Table) used in mapping user jobs into physical
memory.

o The PWQ (Position Wait Queue) is an ordered list of DDBs that
have positioning requests for that unit.

The controller data block (KON) is connected to the UDB and contains
information about the device controller for that unit. The channel
data block (CHN) is linked to the KON and contains information about
the hardware channel associated with that disk controller. The CHN
holds the transfer wait queue (TWQ) for the disk drives on that
channel.

The PWQ and the TWQ contain information for performing I/O requests,
and the order in which they are to be serviced. Both of these queues
are required to drive a disk device. The format and naming scheme are
the same as for the channel data block for tape drives.

Only the static state of the file system can be described here. In a 
timesharing environment, jobs can modify files while the same files 
are being used by other jobs. The monitor requires special 
information for the contention-free management of the files. To keep 
track of currently open files, the monitor's data base shows the
versions of all open files for all PPNs at any given time. 

The file data base is organized using the following data structures: 

o The PPB, the PPN data Block, contains information about all
files for a specific PPN. There is one PPB for each PPN that
has open files. All PPBs for all jobs are linked together;
the first is pointed to by SYSPPB.

o The NMB, the Name Block, contains the file names of all open
files on all file structures for a PPN. There is one NMB for
each open file of each PPN, regardless of the number of
versions of the file that are in existence. A word in the
PPB points to the first NMB in a list.

o The ACC, the access table, contains information needed to
gain access to a specific version of a specific file. The
location of the first RIB is stored here, with the file
structure number. The ACC entries are linked in a ring
through the NMB.

At any time there are two possible versions of a file: the 
current version and the superseding version. Usually there 
is only one ACC; but while the file is being superseded, both 
the old and new versions of the file have ACCs linked to the 
NMB. There may be several ACCs if the file exists on more 
than one file structure, or older versions of a file are 
still open. 

o The UFB is a UFD data block. The monitor keeps a UFB for
each UFD for each file structure for your job. Each UFB 
contains the first retrieval pointer to the UFD. The PPB 
contains a pointer to the UFB for the first structure. 

Every LOOKUP to a file is recorded in the PPB, the NMB, and the UFB. 
If the monitor cannot find a file, it marks the NMB to indicate that 
the file does not exist. Likewise, if the UFD does not exist, the 
monitor marks the UFB accordingly. There are two words in each of 
these data structures to contain this information. The first word is 
the KNO word, short for KNOW. This is set to tell whether the monitor 
checked to see if the file or UFD exists. If the bit is zero, a disk 
read will be required to find out if the file exists. If the bit is 
one, the second word, the YES word, is valid. If the YES word 
contains 0, the file does not exist; if the word is one, the file does 
exist, and there is probably information about it in the PPB and NMB. 
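The KNO/YES test amounts to a three-way decision. The Python sketch below illustrates it under one assumption that is not stated in the text: that each file structure is represented by one bit position shared by the KNO and YES words. The function and parameter names are hypothetical.

```python
def file_status(kno_word, yes_word, structure_bit):
    """Report what the in-core tables say about one structure's bit."""
    if not kno_word & structure_bit:
        return "unknown"              # a disk read is still required
    if yes_word & structure_bit:
        return "exists"
    return "does not exist"


print(file_status(0o400, 0o400, 0o400))
print(file_status(0o400, 0, 0o400))
print(file_status(0, 0o400, 0o400))
```

Only the first two results come from memory alone; the "unknown" case is the one that costs a disk read.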

The goal of this information storage is to reduce the number of disk
reads for discovering whether a file exists and where it is stored.
This is especially useful during debugging, when the same group of
files is used over and over again (source program, compiler, and
linker, for example). Of course, not all the file information can fit
into memory. The disk data structures are managed like a cache, where
the oldest entries are discarded in favor of those accessed more
recently.

The disk DDB is extremely important because it is the central source 
of information for all disk I/O operations. It contains pointers and 
links to many other data structures, including: 

o The current retrieval pointers being used by the disk 
routines, and the block numbers to which the pointers refer. 

o Pointers to the UDB and STR where the file resides . 

o Pointers to the buffer ring header and user buffers. 

o The PWQ and the TWQ, which make a linked list of DDBs waiting 
to use the disk and channel. 

o Pointers to the ACC and UFD. 

Disk DDBs are created when the device is OPENed and a software channel 
is created; they are deleted when the channel is closed. Disk DDBs 
are stored in the user's funny space. 



4.18.3 The Software Disk Cache 

The in-core file information that is being input or output can be 
cached in memory, allowing the monitor to access disk information more 
efficiently. The following data blocks are used in caching disk I/O 
information. 

The data structures for the software disk cache are two doubly linked
lists, a list header, and a hash table. Each entry in the list
contains forward and backward pointers for each of the two lists
(.CBNHB, .CBPHB, .CBNAB, and .CBPAB), a UDB address (.CBUDB), a block
number (.CBBLK), and a pointer to the address in free core where the
block is (.CBDAT). For statistical purposes, the entry also contains
a count of the number of times the block has been accessed since it
was included in the list (.CBHIT).

The list header points to the two linked lists. The first linked list
is the "access" list. The most recently accessed block is at the top
of the list; the least recently accessed block is at the end. The
access list is linked through the .CBNAB/.CBPAB words.

The second linked list is the "free" list. It contains a list of all
blocks that are not currently in use and do not appear in the hash
table. The free list is linked through the .CBNHB/.CBPHB words.

The hash table consists of pointers to the free list corresponding to 
the blocks that hash to the same position. Thus, the hash table 
consists of separate list heads for the lists of blocks that hash to 
that position in the hash table. 

At initialization time (CSHINI), all the blocks are allocated and
linked into the free list. They are also linked into the access list.
The hash table entries are linked to themselves because the table is
empty.

To find an entry, given its UDB and block number, use the block number 
as the offset into the hash table. Use the hash table entry as a list 
head, following the list until you either find a match, or return to 
the header. This is done with the CSHFND routine. In general, these 
lists are very small, most commonly only one or two blocks. 

The main cache handling routine is CSHIO, which will simulate I/O from 
the cache, doing the necessary physical I/O to fill and write the 
cache. Note that this is a write-through cache, so no sweeps are 
required, and the data in the cache always reflects the blocks on 
disk. 
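The lookup and write-through behavior described in this section can be modeled in a few lines. The Python sketch below is a toy model, not the monitor's implementation: the class and method names are hypothetical, Python lists stand in for the doubly linked lists, a dictionary stands in for the disk, and the hash is assumed to be the block number modulo the table size.

```python
class BlockCache:
    """Toy write-through block cache keyed by (unit, block number)."""

    def __init__(self, disk, table_size=8, capacity=4):
        self.disk = disk                                  # stands in for the disk
        self.table_size = table_size
        self.capacity = capacity
        self.buckets = [[] for _ in range(table_size)]    # hash chains
        self.access = []                                  # most recent first

    def _chain(self, block):
        return self.buckets[block % self.table_size]

    def find(self, unit, block):
        """Walk one hash chain, as the text describes for CSHFND."""
        for entry in self._chain(block):
            if entry["unit"] == unit and entry["block"] == block:
                return entry
        return None

    def _touch(self, entry):
        if entry in self.access:
            self.access.remove(entry)
        self.access.insert(0, entry)

    def _insert(self, unit, block, data):
        if len(self.access) >= self.capacity:
            victim = self.access.pop()                    # least recently used
            self._chain(victim["block"]).remove(victim)
        entry = {"unit": unit, "block": block, "data": data, "hits": 0}
        self._chain(block).append(entry)
        self._touch(entry)
        return entry

    def read(self, unit, block):
        entry = self.find(unit, block)
        if entry is None:
            entry = self._insert(unit, block, self.disk[(unit, block)])
        else:
            entry["hits"] += 1
            self._touch(entry)
        return entry["data"]

    def write(self, unit, block, data):
        self.disk[(unit, block)] = data                   # write-through
        entry = self.find(unit, block)
        if entry is None:
            self._insert(unit, block, data)
        else:
            entry["data"] = data
            self._touch(entry)
```

Because every write goes to the backing store before the cached copy is updated, the cache never holds data that the disk lacks, which is why no sweep is needed.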



4.18.4 Finding In-Core File Information 

The following example finds the file information stored in memory for
Job 3. First, you must set up paging for the job:

.C0EPT/   .E0EPT
$Q'1000$U
JBTNAM+3$6T/   ACTDAE         ;Program name
JBTUPM+3[      42000,,354     ;UPT at page 354
.$6U                          ;Mapping command

Then search for the assigned DDBs:

USRJDA[                         ;Channel 0
FOPBUF#+52[                     ;Channel 1
FOPBUF#+53[                     ;Channel 2
FOPBUF#+54[
FOPBUF#+55[
FOPBUF#+56[
FOPBUF#+57[
FOPBUF#+60[
FOPBUF#+61[
FOPBUF#+62[
FOPBUF#+63[
FOPBUF#+64[
FOPBUF#+65[
FOPBUF#+66[                     ;Channel 15
FOPBUF#+67[                     ;Channel 16
FOPBUF#+70[                     ;Channel 17
.USCTA[       20,,741200        ;Check for extended channels
741200[       564200,,740000    ;Channel 20
741201[       560200,,740066    ;Channel 21
741202[       474000,,740154    ;Channel 22
741203[       403000,,740242    ;Channel 23
741204[       441100,,740330    ;Channel 24
741205[       474100,,740416    ;Channel 25
741206[                         ;Channel 26
741207[                         ;Channel 27



In this case, there are six open DDBs, all in the extended channel
table. They point to DDBs in funny space, so they must be for disk
files. Looking closer, you can find the names of the files. The
examples below show how this was done for the first three DDBs listed
above.



740000$6T/         ACT        ;Label this as the DDB
DDB20:                        ;for Channel 20.
DDB20+DEVFIL$6T/   USAGE
DDB20+DEVEXT$6T/   OUT
DDB20+DEVPPN[      1,,7       ;ACT:USAGE.OUT[1,7]

740066$6T/         ACT        ;Label this as the DDB
DDB21:                        ;for Channel 21.
DDB21+DEVFIL$6T/   FAILUR
DDB21+DEVEXT$6T/   LOG
DDB21+DEVPPN[      1,,7       ;ACT:FAILUR.LOG[1,7]

740154$6T/         ACT
DDB22+DEVFIL$6T/   USEJOB
DDB22+DEVEXT$6T/   BIN W
DDB22+DEVPPN$6T[   1,,7       ;ACT:USEJOB.BIN[1,7]

Now examine the USEJOB.BIN file. From the DDB, you can find which
unit the file is on:



DDB22+DEVUNI/   142314,,142314    ;Original UDB,,current UDB

142314$6T/        RAJ3            ;Physical device name
RAJ3:                             ;Label the UDB
RAJ3+UDBKDB[      136770          ;KDB
RAJ3+UNILOG$6T/   DSKA0           ;Logical name within structure
RAJ3+UNIHID$6T/   DSKA0           ;HOME block ID name
RAJ3+UNISYS[      142444,,46000   ;Next UDB in system,,bits
RAJ3+UNISTR[      145324          ;Next UDB for STR
RAJ3+UNICHN[      142444          ;Next UDB on channel
RAJ3+UNIKON[      142444          ;Next UDB on controller



The unit is RAJ3, which is part of the structure DSKA. 

Included in the UDB is a pointer to the structure data block (STR):

145324$6T/   DSKA               ;STR name
DSKA:                           ;Label the STR
DSKA+1[      145274,,10         ;Next STR,,STR number
DSKA+2[      142314,,0          ;First UDB for STR,,K for CRASH.EXE
DSKA+3[      1                  ;Number of units in STR
DSKA+4[      3,,41577           ;Quota words
DSKA+5[      3,,41600
DSKA+6[
DSKA+7[
DSKA+10[
DSKA+11[     266532
DSKA+12[     777777,,777014
DSKA+13[     7                  ;Mount count
DSKA+14[     410,,512304        ;First retrieval pointer to MFD



There are two other methods for locating a disk structure. The first
is to start with SYSSTR and follow the links to each structure:

SYSSTR/      247103,,1     ;Pointer in left half
247103$6T/   SIRS          ;1st STR in linked list
247104[      240137,,15
240137$6T/   BADP          ;2nd STR in list
240140[      110521,,14
110521$6T/   7A            ;3rd STR in list
110522[      145324,,1
145324$6T/   DSKA          ;4th STR in list

Or, with the file structure number, you can index into TABSTR:

TABSTR/      777733,,1
TABSTR+1/    110521
TABSTR+2/    145324
145324$6T/   DSKA

Notice that the links started by SYSSTR are not in the same order as 
TABSTR. 
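Following the SYSSTR chain is a plain linked-list walk. The Python sketch below replays the four links shown in the typeout above. It assumes, as the example suggests, that the link to the next STR is in the left half of the word at offset 1; the address given for SYSSTR is hypothetical, and the table holds only the words that appear in the example.

```python
SYSSTR = 0o1000   # hypothetical address; the real value comes from the monitor

# address -> (left half, right half), copied from the typeout above
SAMPLE = {
    SYSSTR:   (0o247103, 0o1),
    0o247104: (0o240137, 0o15),
    0o240140: (0o110521, 0o14),
    0o110522: (0o145324, 0o1),
}


def walk_strs(memory, sysstr):
    """Return the STR addresses reachable from SYSSTR, in link order."""
    address = memory[sysstr][0]            # pointer is in the left half
    seen = set()
    chain = []
    while address and address not in seen:
        seen.add(address)
        chain.append(address)
        link = memory.get(address + 1)     # word 1: next STR,,STR number
        if link is None:                   # ran off the data we have
            break
        address = link[0]
    return chain


print([oct(a) for a in walk_strs(SAMPLE, SYSSTR)])
```

The `seen` set guards against a corrupted chain that loops back on itself, which is worth checking for when the list comes from a crash dump.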

You can use the UDB to find several other structures:

RAJ3+UNIQUE/                 ;Position wait queue
RAJ3+UNIPTR/                 ;-Length,,addr of swap SAT
RAJ3+UNISAB/   7,,31271      ;First SAB in ring,,addr of SPT

From the UDB, you can find the KDB:

RAJ3+UDBKDB/   136770           ;Ptr in UDB to KDB
136770$6T/     RAJ              ;Controller name
RAJ:                            ;Label this
RAJ+1[         76237            ;Next controller on system
RAJ+2[         7                ;CPU accessibility mask
RAJ+3[         136704           ;KDBCHN -- CHN
RAJ+4[         777740,,137063   ;KDBIUN -- Initial pointer to units

You can get the channel data block from the KDB:

RAJ+KDBCHN/   136704       ;KDB pointer to CHN
136704/                    ;-1 if channel idle
CHN:                       ;Label it
CHN+1/        142750,,0    ;Next CHN,,last UDB with error
CHN+2/                     ;Error information
CHN+3/
CHN+4/

The other file information can be found by starting with SYSPPB and
following pointers to the correct PPB, NMB, and ACC. (DEVACC in the
DDB also points to the ACC.)

SYSPPB/   120140,,0      ;Pointer to first PPB
120140[   1,,4           ;Project,,programmer number
120141[   120440,,0      ;Next PPB in system,,
120440[   1,,7           ;Project,,programmer number
PPB:                     ;Label it
PPB+1[    120560,,0      ;Next PPB in system,,
PPB+2[    120450,,0      ;First UFB this PPN,,
PPB+3[    120460,,0      ;First NMB this PPN,,bits
PPB+4[    6              ;Use count
PPB+5[    410            ;KNO bits
PPB+6[    410            ;YES bits
PPB+7[                   ;Interlock bits


You can look for the file USEJOB.BIN in the NMB:

120460$6T/   USAGE             ;File name - USAGE
120461[      120510,,0         ;Next NMB,,0
120510$6T/   FAILUR            ;File name - FAILUR
120511[      120540,,0         ;Next NMB,,0
120540$6T/   USEJOB            ;File name - USEJOB
NMB:                           ;Label it
NMB+1[       122670,,0         ;Next NMB,,0
NMB+2[       26325             ;Compressed file pointer
NMB+3[       120550,,425156    ;ACC,,file extension in SIXBIT
NMB+4[       110000,,0         ;File structure number
NMB+5[       400               ;KNO bits
NMB+6[       400               ;YES bits
NMB+7[       2                 ;Use count



And finally, you can get to the ACC from the NMB: 



120550[   156                ;Highest block allocated
ACC:                         ;Label the ACC
ACC+1[    120542,,200000     ;NMB,,bits
ACC+2[    1100,,26325        ;First retrieval pointer
ACC+3[                       ;Dormant ACCs
ACC+4[    110020,,120440     ;Bits,,PPB
ACC+5[    222136,,410
ACC+6[    145
ACC+7[    55744,,332136



The ACC points back to both the NMB and PPB. Note, however, that the
ACC may point to another ACC, which may point to the NMB. This is
ascertained by examining the last digit of the left half of the NMB
word in the ACC (ACC+1 in this example). If the last digit is 2, as in
this example, the left half of the word points to an NMB. If the digit
is not 2, the word points to another ACC.
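The last-digit rule is a test on the low-order three bits of the left half. The one-line Python sketch below shows the check; it takes the rule exactly as the text states it, applied to the left half of the NMB word in the ACC, and the function name is hypothetical.

```python
def points_to_nmb(left_half):
    """True if the last octal digit of the left half is 2."""
    return (left_half & 0o7) == 2


print(points_to_nmb(0o120542))   # left half of 120542,,200000 in the example
```

When the test fails, the text says the word leads to another ACC, so the same check is repeated on that ACC until an NMB is reached.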




The PPB also points to the UFB.

DDB22+DEVUFB/   120450            ;DDB pointer to UFB
PPB+PPBUFB/     120450,,          ;PPB pointer to UFB
120450/         377777,,700521    ;Total blocks left this UFD
UFB:                              ;Label it
UFB+1[          122420,,775400    ;Next UFB,,bits
UFB+2[          100,,52166        ;First retrieval PTR to this UFD
UFB+3[          5                 ;Bits
UFB+4[          110000,,          ;File structure number
UFB+5[          104,,             ;N if job N owns AU for this UFB
UFB+6[                            ;Non-zero if waiting for AU
UFB+7[                            ;=1 if UFD has empty data blocks



In all cases, check the Monitor Tables Descriptions and the source 
listings to find the interconnections between the data structures and 
how to interpret what is stored in them. 



CHAPTER 5
ERROR HANDLING ROUTINES 



The monitor reports hardware and software problems by displaying error 
messages on the CTY, but these messages include only a small portion 
of the information that the monitor stores in its database. 

This chapter will show you how to take a message from the CTY and use 
it to trace through the dump to obtain more information. This 
involves working with the APR interrupt routine, the page fail trap 
routine, and the stopcode routine. You can use this information to 
deduce the scope and nature of the problem more accurately. 

The error routines of the monitor are designed to handle both software 
and hardware errors. When software errors are detected, control
usually jumps to an error handling routine for processing. Hardware 
errors, however, can interrupt processing and sometimes halt the 
system. 



5.1 HARDWARE ERRORS 

You can use the CTY message to trace an error to the actual hardware 
that failed. The following types of hardware-related messages may 
appear on the CTY . 

The most serious hardware error is indicated by one of the following
messages:

?NON-RECOVERABLE MEMORY PARITY ERROR IN MONITOR
[CPU HALT]

or

?NON-EXISTENT MEMORY DETECTED IN MONITOR
[CPU HALT]

In this case, the error is so serious that the processor is halted 
immediately and no further error processing can be done. 

A second type of problem is an AR/ARX parity trap, indicated by the
following message:

************
CPU0 AR/ARX PARITY TRAP AT USER PC 401123 ON dd-mmm-yy
JOB 1 [SYSTAT] WAS RUNNING
PAGE FAIL WORD = 000000,,00011
MAPPED PAGE FAIL ADDRESS = 547000,,560271
INCORRECT CONTENTS = 000000,,000000
CONI PI, = 000000,,000377
RETRIES UNSUCCESSFUL, OFFENDING LOCATION ZEROED
************

Another type of parity trap is a page table parity trap, indicated by
the following:

************
CPU0 PAGE TABLE PARITY TRAP AT EXEC PC 414555 ON dd-mmm-yy hh:mm:ss
PAGE FAIL WORD = 000000,,00011
CONI PI, = 010000,,020377
************

A CPU interrupt due to a parity or NXM error is reported as:

************
CPU1 PARITY ERROR INTERRUPT AT USER PC 343413 ON dd-mmm-yy hh:mm:ss
JOB 2[WBKI] WAS RUNNING
CONI APR, = 003002,,312022
CONI PI, = 010000,,020377
ERROR INVOKED BY A message
************

This report can have several variations, depending on the CPU and the 
specific error. The monitor can include any of these error messages: 

CACHE WRITE-BACK FORCED BY A SWEEP INSTRUCTION.
CHANNEL STATUS WORD WRITE.
CHANNEL DATA WORD WRITE.
CHANNEL READ FROM MEMORY.
CHANNEL READ FROM CACHE.
CPU WRITE TO MEMORY (NOT CACHE).
CACHE WRITE-BACK FORCED BY A CPU WRITE.
CPU READ OR PAGE REFILL FROM MEMORY.
PAGE REFILL FROM CACHE.

After this or other errors, the monitor may also attempt to check for
problems by scanning memory for parity errors or nonexistent memory.
A memory scan can produce one of the following reports:

MEMORY PARITY SCAN INITIATED BY CPU0 ON dd-mmm-yy hh:mm:ss
NOTHING WAS FOUND
************

NON-EXISTENT MEMORY SCAN INITIATED BY CHANNEL 1 ON CPU1 ON dd-mmm-yy hh:mm:ss
NON-EXISTENT MEMORY DETECTED:
AT 314243 (PHYS.)
************

The channel number (CHANNEL 1) listed in this message refers to the
software channel data block (CHN) number, not an RH20 channel.

Memory parity errors or nonexistent memory errors on a channel produce 
a special message: 

CPU1 CHANNEL MEMORY PARITY ERROR ON dd-mmm-yy hh:mm:ss
DEVICE IN USE IS RPA2
CHANNEL TYPE IS type
TERMINATION CHANNEL PROGRAM ADDRESS = 000477
TERMINATION DATA TRANSFER ADDRESS = 251470
LAST THREE CHANNEL COMMANDS EXECUTED ARE:
760000,,252777
760000,,251777
760000,,250777

The CHANNEL TYPE listed in this message may be DF10C, DX10, RH20,
CI20, NIA20, or SA10.

Hardware errors signal the software in either of two ways: by a
processor (APR) interrupt or by a page fail trap. APR interrupts are
usually generated on the highest PI level, because CPU errors are
serious and must interrupt other devices. When notified of such
errors, the monitor reads the hardware registers and takes the
appropriate action.

To obtain more information about the error and the state of the 
monitor, you must examine the dump. It is important to understand how 
the monitor handles hardware errors . The following sections describe 
the routines in the monitor that handle errors . 



5.1.1 APR Interrupt Routine 

The routine to handle APR interrupts is APnINT, where n is the CPU 
number. It is defined by a macro in COMMON, and handles all the 
possible conditions that could cause a processor interrupt, which are: 

o Cache-sweep-done 

o Power fail 

o Timer timeout (clock tick) 

o I/O page fail error 

o NXM error

o Cache directory parity error 

o MB parity error 

o Address parity error 

o SBUS error 

Clock tick and cache-sweep-done interrupts happen frequently, and the
monitor deals with them quickly. The other conditions require more
extensive processing.

MB and NXM errors undergo even more analysis and eventually produce 
one or more of these error reports: CPU parity error or NXM 
interrupt, a memory scan, or the nonrecoverable error message. 



5.1.2 Page Fail Trap Routine 

Page fail traps are caused by one of the following conditions:

o Page fault

o Proprietary violation

o AR/ARX parity error (KL10 only)

o Page table parity error (KL10 only)

o Page refill failure (KL10 only)

o Address break (KL10 only)

o Illegal section number (KL10 only)

o Illegal indirection (KL10 only)

o Non-existent device or register (KS10 only)

o Hard memory error (KS10 only)

o NXM error (KS10 only)

Some of these conditions are the result of normal operations, such as 
an address break, proprietary violation, or page fault. Others are 
handled as error conditions. The page fail word describes the type of
page fault that occurred. The trap handler is located at SEILM in 
APRSER. 

The APR interrupt routine and the page fail trap routine use the same 
push-down list, ERnPDL, once an error has been detected. The power 
fail routine uses another push-down list, PWFPDL. 

The channel error report is produced at the interrupt level of the 
device that was doing the transfer. This report usually occurs for 
disk and tape devices.

If a parity error is detected in fast memory, DRAM, or CRAM, the EBOX 
stops immediately by turning off its clocks. The front-end processor 
performs any diagnostic action that is necessary. 



5.1.3 Saved Hardware Error Information

The error handling routines store information about hardware errors in 
the CPU Data Block (CDB). Some of those locations in the CDB are:

.CnACN (APRSTS)    CONI APR,
.CnAEF             APR error flag

Parity Error Information:

o .CnTPE contains the total number of parity error words in
memory.

o .CnSPE contains the total number of nonreproducing parity 
errors in memory. 

o .CnMPA contains the memory parity address for this CPU. 

o .CnMPW contains the memory parity word for this CPU. 

o .CnMPP contains the memory parity PC for this CPU. 

o .CnSB0 contains the SBUS Diag instruction.

o .CnS0A contains the answer from the SBUS Diag instruction.

o .CnSB1 contains the SBUS Diag Function 1 instruction.

o .CnS1A contains the answer from the SBUS Diag Function 1
instruction.

NXM Information: 

o .CnTNE contains the total number of NXMs for this CPU. 

o .CnSNE contains the total number of nonreproducible NXMs for 
this CPU. 

o .CnMNA contains the first address found with NXM.

AR/ARX Parity Information: 

o .CnPBA contains the physical address that registered bad 
parity on last AR/ARX parity trap. 

o .CnTBD contains the contents of the bad word on the last 
AR/ARX parity trap. 

o .CnNPT contains the total number of AR/ARX parity traps. 

o .CnAER contains the results of RDERA on a parity/NXM
interrupt.

o .CnPEF contains the results of CONI APR on a parity/NXM
interrupt.

o .CnPPC contains the PC on the last AR/ARX parity trap. 

o .CnPFW contains the page fail word on the last parity trap. 

o .CnHPT contains the number of hard AR/ARX parity traps. 

o .CnSAR contains the number of soft AR/ARX parity traps.

o .CnPTP contains the total number of page table parity traps. 



5.1.4 Hardware Error Checking

The KL10 processor is made up of the following hardware components:
the EBOX, the MBOX, and various interfaces and buses. The EBOX, short
for Execution BOX, is responsible for the execution of
instructions. The MBOX, short for Memory BOX, controls transfers to
and from memory, cache, channels, and the EBOX.

The EBOX is composed of the following: 

o Instruction Register (IR) receives the instruction code from
the Arithmetic Logic Unit and passes it to the CRAM/DRAM for
execution.

o Dispatch RAM (DRAM) and Control RAM (CRAM) hold the microcode 
that implements the PDP-10 instruction set. 

o Arithmetic Logic Unit (ALU) is the major working area of the 
processor. It has three fullword registers: 

AR (Arithmetic Register) 

BR (Buffer Register) 

MQ (Multiplier/Quotient Register) 

The first two registers also have fullword extensions: ARX 
and BRX. 

o Fast Memory (FM) contains the accumulators (ACs). The EBOX
has eight AC sets. 

o Virtual memory address (VMA) keeps the PC and sends the 
virtual address to the pager in the MBOX. 

o Virtual memory address adder (VMA AD) helps the VMA in its 
computations.

o Program Counter (PC) holds the virtual address of the next 
instruction to be executed. 

The MBOX is composed of: 

o Pager (also known as the hardware page table), which holds
512 (MCA20) or 1024 (MCA25) mapping entries from the EPT or
UPT.

o Physical Memory Address register (PMA), which holds the
physical memory address of the next instruction.

o Cache (data and directory): high-speed semiconductor memory
that stores copies of data from regular memory in order to
speed up memory fetches. (MCA20 allows up to 2K of storage;
MCA25 allows up to 4K of storage.)

o Memory Buffer (MB), to control the flow of data to and from
cache, channels, memory, and the EBOX.

o Cache/MB interface, connecting cache to MB. 

In addition, a number of buses and interfaces may be connected to the
MBOX, EBOX, and other parts of the system, such as:

o E/M interface connects the MBOX and EBOX.

o S/X BUS/MB interface connects the MBOX with the core/MOS
controllers. The DMA20 is on the SBUS and interfaces to
external memory.

o EBUS connects the EBOX to four DTE20s or eight RH20 slots 
(which may contain RH20 or KLIPA/KLNI controllers) and the 
DIA20/DIB20 interface to the traditional I/O bus devices. 

Combinations of the following modules connect memory and MASSBUS
devices:

o Channel/MB interface connects MB with the channel controller.

o Channel controller controls the flow of data through the
CBUS.

o CBUS and CBUS interface handle data transfers that go
directly to the MBOX, bypassing the EBOX.

o RH20 MASSBUS controller connects the CBUS to the MASSBUS. 

o MASSBUS is a standard bus for interfacing tapes and disks to 
the KL. 

o Device controller (BA10, TD10, RH10,...).

o I/O bus (PTP, PTR, ...).

o Channel interfaces (DX10, DX20,...).

o CI20 port connecting the KL10 with the CI20 bus.

o NIA20 port connecting the KL10 with the Ethernet cable.

The KL10 dynamically generates parity in the following places:

o On the output side of the channel status RAMs 

o On the output side of the AR 

o Entering the pager from MB or AR 

o Data stored in fast memory 

o Data stored into the channel data buffers (18-bit parity is 
generated) 

Parity is checked after the following operations: 

o On all requests from the MBOX 

o Data leaves MB to go to the DMA20, pager, channel, cache, AR 
or the arithmetic extender 

o Data is paged out 

o Data enters and leaves the RH20 or the MASSBUS 

o Data enters the AR from the MBOX

o Data enters and leaves AR during DTE PI Level interrupt 
handling 

o Data enters the ARX from the MBOX 

o Data leaves fast memory 

o Control leaves CRAM/DRAM 

Errors detected through parity checking in the last two conditions
cause the KL (EBOX/MBOX) clock to halt immediately, provided that the
correct conditions have been enabled. The relationships among the
places where errors are detected and the conditions they evoke are shown
in the following table. Note that parity is generated by the
transmitting device. This table does not include power-fail
conditions.



Table 5-1: Hardware Errors

Component           Error                       Error Indicator

MA20                Incomplete cycle            SBUS error bit
                    Address parity error        Address parity bit

DMA20               Data parity error           SBUS error bit
                    Address parity error        Address parity bit
                    NXM error                   SBUS error bit

MB                  Data parity error           MB parity error bit
                    Nonexistent memory          NXM error bit

Pager               Page table parity error     Page fail trap
                                                code=25
                    Pager to cache directory    CD parity error bit

Arithmetic Logic    AR parity error             Page fail trap
(AR, ARX)                                       code=36 (for Exec)
                                                code=76 (for User)
                    ARX parity error            Page fail trap
                                                code=37 (for Exec)
                                                code=77 (for User)
                    AR/ARX/EBUS parity error*   I/O page fail bit

RH20                Data parity error           Device interrupt

DX10                Data parity error           Device interrupt

* This type of error includes any type of paging failure while PI
CYCLE is set. The PI CYCLE is a microcode condition that is
enabled when the microcode honors a PI request and is disabled
when the first XPCW instruction occurs for Levels 1-7 or a Level 0
request is completed.
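
When you annotate dumps offline, the parity-related page fail codes in Table 5-1 can be kept in a small lookup. The following Python sketch is an illustration only: the dictionary and function names are hypothetical, the codes are the octal values listed in the table, and nothing is assumed about where the code sits inside the page fail word.

```python
# Octal page fail codes copied from Table 5-1. The names used here are
# illustrative and do not correspond to monitor symbols.
PARITY_PAGE_FAIL_CODES = {
    0o25: ("Page table parity error", None),
    0o36: ("AR parity error", "exec"),
    0o76: ("AR parity error", "user"),
    0o37: ("ARX parity error", "exec"),
    0o77: ("ARX parity error", "user"),
}


def describe_page_fail_code(code):
    """Return (description, mode) for a code listed in Table 5-1, else None."""
    return PARITY_PAGE_FAIL_CODES.get(code)
```

A code that is not in the table returns None, which keeps ordinary page faults from being mislabeled as parity errors.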



5.2 STOPCODES

Stopcodes are symbolic names representing errors detected by the 
monitor. Stopcodes are generated by the STOPCD or BUG. macros. The 
DIE routine records error information and initiates a reload, if 
required. For a complete list of stopcodes, refer to the Stopcodes
Specification.

The CTY for each CPU in a multi-CPU configuration records the 
stopcodes that occur on that CPU. You can use FILDDT to find the 
module where a stopcode is defined. You can find a stopcode in the 
crash file by looking for a symbol of the form S..name (for 
3-character stopcode names) or just name (for 6-character stopcode 
names). The following example shows how to find the module where a
KSW stopcode is defined:

S..KSW?

TAPSER G 
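
The naming rule above is easy to get wrong when you script searches through a crash file. As a minimal sketch (the function name is hypothetical, and only the two name lengths described above are handled), the symbol to look up can be derived like this:

```python
def stopcode_symbol(name):
    """Return the symbol to search for, given a stopcode name.

    3-character names are found as S..name; 6-character names are
    found under the name itself, as described in the text.
    """
    name = name.upper()
    if len(name) == 3:
        return "S.." + name
    if len(name) == 6:
        return name
    raise ValueError("expected a 3- or 6-character stopcode name")
```

For the example above, stopcode_symbol("KSW") yields the string you would type to FILDDT, followed by a question mark.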

Stopcodes are defined in many modules of the monitor, but they are 
generated by the same macro, the STOPCD macro. The STOPCD macro is 
called with: 

STOPCD cont, type, name, disp 

where:

cont    is the location to jump to after processing the error.

type    is the type of failure and determines the specific
        course of action. It can have one of the following
        values:

        o HALT

        o STOP

        o JOB

        o CPU

        o DEBUG

        o INFO

        o EVENT

name    is the unique stopcode name.

disp    is the address of the routine containing additional
        information, if appropriate.

The severity of the error is indicated by the type of stopcode. The
types of stopcodes are: 

o HALT stopcodes occur after the most severe errors. The CPU
cannot continue automatically after a HALT, no additional
information is displayed on the CTY, and no information is
saved (no crash file is automatically created). HALT
stopcodes are also the least likely of the stopcodes to
occur, and are usually caused by recursive calls to the DIE
routine.

HALT stopcodes indicate serious problems that endanger 
further system operation. The RSX-20F console front-end 
(using the HALT.CMD file) gathers pertinent status and error 
information. 

o STOP stopcodes are also serious, and cause the system
(all CPUs) to put their status into memory and wait for the
policy CPU to dump and reload the monitor.

o JOB stopcodes are those that affect only one job but may 
indicate problems in the system. If there is an interrupt in 
progress, the system will be reloaded. If not, only the 
faulty job will be terminated. Then a dump is taken and the 
system continues.

o A CPU stopcode is important only for multiple-CPU systems. 
This stopcode will stop only the current CPU, leaving the 
others running. It acts as a STOP stopcode in any of the 
following cases: 

On a single-CPU system

When only one processor is running in a multiple-CPU system

When DF.CP1 is set in the DEBUGF word

o A DEBUG stopcode affects the system in different ways,
depending on the contents of the DEBUGF word (short for DEBUG
Flags). By setting certain bits in this word, a system
programmer can control the effect of certain stopcodes and the
manner in which the system is reloaded. The DEBUGF flags are
listed in Section 6.3.

o An INFO stopcode displays a message on the CTY and rings the 
terminal bell, informing the operator of an event that may be 
of interest. Most INFO stopcodes are harmless and can be
ignored. They do not halt the system or job, do not initiate 
a memory dump, and do not cause a system reload. 

o An EVENT stopcode displays a message on the CTY, similar to 
an INFO stopcode, but does not ring the terminal bell. 
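
The three cases in which a CPU stopcode behaves like a STOP stopcode can be written as one predicate. This Python sketch is an illustration under stated assumptions: the function and parameter names are hypothetical, DF.CP1 is taken to be Bit 4 of the DEBUGF word as listed in Section 6.3, and bits are numbered from 0 at the sign (leftmost) end of a 36-bit word.

```python
WORD_BITS = 36  # PDP-10 word size; Bit 0 is the sign bit


def bit_mask(bit):
    """Mask for a bit numbered from 0 at the left of a 36-bit word."""
    return 1 << (WORD_BITS - 1 - bit)


DF_CP1 = bit_mask(4)  # Bit 4 per the DEBUGF table in Section 6.3


def cpu_stopcode_acts_as_stop(cpus_configured, cpus_running, debugf):
    """True when a CPU stopcode is handled like a STOP stopcode."""
    return (
        cpus_configured == 1          # single-CPU system
        or cpus_running == 1          # only one processor still running
        or bool(debugf & DF_CP1)      # DF.CP1 set in the DEBUGF word
    )
```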



5.2.1 Stopcode Processing 

The DIE routine in ERRCON processes stopcodes in the following manner: 

1. Increments .CnDWD to indicate that this CPU has died and to 
protect the code from being entered twice by that CPU. 

2. Saves the PI status in .CnCPI and turns off the PI system. 

3. Saves AC Blocks 0, 1, 2, 3, and 4 in memory.

4. Stores stopcode PC in %SYSPC and .CnSNM. 

5. Sets up error stack from ERnPDL. 

6. Creates CPU and device status block data using RCDSTB, and 
calls DAEMON to output those buffers. 

7. Initiates a cache sweep and waits with control in the ACs
until the sweep is finished.

8. Enters the secondary protocol.

9. Attempts to get the DIE interlock. 

10. Prints stopcode information on CTY. 

11. Dispatches to the routine that will take the dump and handle 
the specific type of stopcode. 

INFO and EVENT stopcodes perform all the functions listed here, except 
that they do not turn off the PI system, do not halt the system, and 
do not perform a dump and reload. The EVENT output on the CTY is 
formatted differently from the other types of stopcodes.



5.2.2 Continuing from Stopcodes 

JOB and DEBUG stopcodes do not ordinarily crash the system. They 
allow error collection to be done, and then the system can continue. 
Whenever a JOB or DEBUG stopcode occurs, the default action of the 
monitor is to dump memory to disk for later analysis. This is known 
as a continuable stopcode dump and is handled by BOOT. This allows 
the system to continue to do work even though the state of the machine 
is being saved. 

The majority of stopcodes are caused by a corruption of some portion 
of the monitor's database. Often, a corrupted piece of data will 
cause several stopcodes, one right after the other. However, the 
first dump is the most important. When you are analyzing a series of 
crashes, look at the first crash in the series. 

If two or more crashes have the same time stamp, you should look at 
the dump with Bit 8 clear in the DEBUGF word. You can probably ignore 
the other dump(s). Refer to Section 6.3 for more information about
DEBUGF flags.
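
The selection rule in this section can be sketched in Python as follows. The Crash record, its fields, and the function name are hypothetical; the sketch assumes that "first crash" means the earliest time stamp and that you have already read the DEBUGF word from each dump.

```python
from collections import namedtuple

# Hypothetical record: one entry per crash file in the series.
Crash = namedtuple("Crash", "name timestamp debugf")

BIT8 = 1 << (35 - 8)  # Bit 8 of a 36-bit word, counting Bit 0 as the sign bit


def dump_to_examine(crashes):
    """Pick the first crash; among equal time stamps prefer Bit 8 clear."""
    first_time = min(c.timestamp for c in crashes)
    candidates = [c for c in crashes if c.timestamp == first_time]
    bit8_clear = [c for c in candidates if not c.debugf & BIT8]
    return (bit8_clear or candidates)[0]
```

If every dump with the earliest time stamp has Bit 8 set, the sketch falls back to the first of them rather than returning nothing.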



5.2.3 Special Stopcodes 

Certain stopcodes occur more frequently because they represent a wide 
range of problems. Under these conditions, debugging becomes more 
difficult. The stopcodes of this type that you should be aware of are
KAF, IME, UIL, and EUE. The causes for them mentioned in the
following paragraphs are not complete, but they illustrate the way 
such a stopcode could occur. 

Keep-Alive Fail (KAF) stopcodes occur when the system is hung or
looping. In this situation, you cannot get response from the 
terminals, there are no jobs running, and no I/O is being done. 
Eventually, the front-end, RSX-20F, realizes the keep-alive count has 
expired, and forces the KL to execute the instruction in physical 
location 71 of memory, XPCW@.CnKAF, which stores the contents of P in 
KFnSVP, and issues the KAF stopcode. The address (a double-word PC)
of the instruction that was being executed is stored at APnKAF and 
APnKAF+1. 

A KAF occurs when something prevents the processor from reaching clock 
level, thus preventing the keep-alive count from being updated and 
scheduling from being done. This can occur if a process at a higher 
PI level never exits, which could be caused by one of the following: 

o A higher level interrupt goes into an infinite loop. 

o A higher level interrupt does not clear an interrupt signal 
when the interrupt routine exits. The signal, being 
constantly asserted, causes one interrupt after another. 

o The clock does not tick because it has malfunctioned. 

o The clock does not tick because the PI system has been 
disabled. 

o A monitor routine does not release an interlock. 

o A CPU in a multiple-CPU system does not release a CPU 
interlock.

IME stands for Illegal Memory Reference from Executive and is issued 
when an unexpected page fault occurs in exec mode. Some of the 
potential causes for an IME include: 

o An attempt to write into the monitor's high segment. 

o An attempt to reference data mapped through a UPT that is not 
addressable.

o Invalid indexing because accumulators were misused. 

To solve IMEs, you can look at the following locations in the UPT: 

o .USPFW (location 500) contains the page fail word. 

o .USPFP (501) contains the flags in the left half. 

o .USPFN (502) contains the PC of the page fail instruction. 
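
If you have copied a UPT page out of the dump, the three words above can be pulled out by offset. This is a minimal Python sketch: the function names and the idea of holding the page as a list of integers are assumptions, while the symbols and octal offsets are the ones listed above.

```python
# Octal offsets into the UPT, as listed in the text.
UPT_PAGE_FAIL_OFFSETS = {".USPFW": 0o500, ".USPFP": 0o501, ".USPFN": 0o502}


def page_fail_words(upt):
    """Return {symbol: word} for the page fail locations in a UPT page.

    `upt` is assumed to be a sequence of 36-bit words indexed by offset.
    """
    return {sym: upt[off] for sym, off in UPT_PAGE_FAIL_OFFSETS.items()}


def left_half(word):
    """Left 18 bits of a 36-bit word (where .USPFP keeps the flags)."""
    return (word >> 18) & 0o777777
```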

The CDB also contains some relevant information, referenced by the
following symbols:

o .CnAPC contains the APR error or trap PC on this CPU. 

o .CnPFW contains the page fail word on traps to SEILM.

o .CnPPI contains the results of CONI PI, on a parity/NXM trap. 

o .CnTCX contains the page fail word context word on traps to 
SEILM. 

EUE stands for Executive UUO Error and occurs when the monitor
attempts to execute an illegal UUO (usually with an opcode of 0).
This stopcode is usually the result of the monitor branching to an
address that contains data instead of an instruction. Its causes are
very similar to those of an IME. The same problem may produce an EUE
one time and an IME another time, depending on specific conditions.

To solve EUE and UIL stopcodes, you should look at the contents of the 
following locations in the UPT: 

o .USMUO contains the flags and left half of the UUO. 

o .USMUP contains the address of the UUO routine. 

o .USMUE contains the effective address half of the UUO. 

o .USUPF contains the process context word at the time of the 
UUO. 



5.3 ERRORS DETECTED BY RSX-20F 

When the RSX-20F console front-end detects certain KL error 
conditions, it collects data using command files (sometimes called 
TAKE files). The error conditions and the command file for each are
listed below. 

The command files are used to gather status and error data for special 
cases, and (on single-CPU systems) to assist in system continuation 
after a stopcode. 

When the RSX-20F reload-enable flag is set, the following command 
files are automatically executed for the following conditions: 



File          Error Condition

CLOCK.CMD     Field service probe clock error stop

CRAM.CMD      Control RAM (CRAM) clock error stop

DRAM.CMD      Dispatch RAM clock error stop

EBUS.CMD      EBUS parity error

FMPAR.CMD     Fast memory parity clock error stop

DEX.CMD       Deposit/Examine failure

HALT.CMD      KL executes HALT instruction

TIMEO.CMD     Protocol timeout condition

KPALV.CMD     Keep-alive failed condition (*)

DUMP.CMD      Optional system hung file



* When a Keep-Alive Fail occurs, the KPALV.CMD file is not used 
immediately. Instead, RSX-20F attempts to reload the monitor at 
location 71 (described in Section 5.2.3). If the front-end fails to
reload the monitor, RSX-20F takes a Keep Alive Fail and executes the 
KPALV.CMD file. However, if the Retry-Enable Flag (which is set, by 
default) is cleared, the KPALV.CMD file is executed immediately 
without trying a reload. 
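
The keep-alive handling described in the note above depends on two conditions. The following Python sketch restates that decision; the function name, parameter names, and the action strings are hypothetical and are not RSX-20F output.

```python
def keep_alive_actions(retry_enable_flag, reload_succeeds):
    """List the front-end actions after a keep-alive failure.

    retry_enable_flag: state of the Retry-Enable Flag (set by default).
    reload_succeeds:   whether the reload attempt at location 71 works;
                       ignored when the flag is cleared.
    """
    if not retry_enable_flag:
        # Flag cleared: KPALV.CMD runs immediately, no reload attempt.
        return ["execute KPALV.CMD"]
    actions = ["attempt reload at location 71"]
    if not reload_succeeds:
        actions.append("execute KPALV.CMD")
    return actions
```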

The KPALV.CMD file is useful when the system hangs without doing any
productive work. You can execute KPALV.CMD to gather status 
information and force a dump. To invoke KPALV.CMD, type the 
following commands on the CTY: 



^\                ;<CTRL-backslash>
PAR>TAKE KPALV    ;initiates the .CMD file



CHAPTER 6
DEBUGGING THE MONITOR 



There are two ways to make corrections to the monitor. The first 
method is to alter the running monitor using the monitor-specific 
FILDDT. You can use this method when the changes are small and it is 
unlikely that the system will crash due to patching errors. The
second method involves taking the system standalone and loading the
monitor with EDDT.



6.1 PATCHING WITH FILDDT 

The monitor-specific FILDDT contains functions that allow you to 
change or patch the running monitor. To run FILDDT and patch the 
monitor, you must use the following commands: 

.R MONDDT 
File: /M/P 

The /M switch indicates that all Examine and Deposit functions will 
refer to the running monitor. The /P switch allows you to patch the 
monitor. To use these switches, your job must have PEEK and POKE 
privileges.

Often the changes to be added in the monitor do not fit easily into 
the existing code. To add several lines of code, you must access the 
pre-allocated patching space that is resident in the running monitor. 
The patching space starts at the address pointed to by the symbol 
PATCH. The number of words reserved for patching space is assembled
into the monitor module PATCH.MAC (the symbol is PATSIZ), but the
patch area is usually 50 (octal) words long. It is recommended that
large changes be made directly to monitor sources, not to the running 
monitor. 
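
Before you type in a patch, it is worth checking that it fits in the patch area. The Python sketch below is illustrative only: the function name and the words_already_used parameter are hypothetical, and the default size is the usual 50 (octal) words mentioned above, so substitute the PATSIZ value from your own monitor.

```python
DEFAULT_PATSIZ = 0o50  # usual patch area size in words; check PATSIZ locally


def patch_fits(patch_words, words_already_used=0, patsiz=DEFAULT_PATSIZ):
    """True if a patch of patch_words words still fits in the patch area."""
    if patch_words < 0 or words_already_used < 0:
        raise ValueError("word counts must be non-negative")
    return words_already_used + patch_words <= patsiz
```

For example, with 40 (octal) words already used by earlier patches, a further 10 (octal) words fit in the default area and 11 (octal) do not.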

CAUTION 

When you install a change to the running monitor, 
remember that the monitor code should not dispatch to 
the patched location until you have installed the 
entire patch. Therefore, the instruction that 
dispatches to the changed code should be the last 
instruction you install. It is recommended that you 
use the $< command to FILDDT specifying PATCH as the 
patching area. 



6.2 USING EDDT

EDDT is a version of DDT that runs in both user and exec modes. EDDT 
is part of the monitor, in the sense that it resides in the monitor's 
.EXE file and is loaded into core with the monitor. The command to 
BOOT to enable debugging with EDDT is:

BOOT>monitor-filespec/EDDT 

The /EDDT switch instructs BOOT to start at the EDDT start address 
rather than the monitor's normal starting address. You can type /EDDT 
or /START:401.

When BOOT starts the monitor at location 401, the CPU is running 
unmapped. In this mode, EDDT could run, but the symbol table is 
inaccessible. Since this situation would provide only limited 
debugging capabilities, the monitor sets up minimal page mapping. 
When this is done, all monitor code and the symbol table will be 
accessible from EDDT. The monitor then jumps to EDDT.

When EDDT starts, it displays "EDDT" on the CTY and behaves much like
user-mode DDT. There is no prompt, and the command syntax is nearly
identical to DDT. For more information on the exec-mode debugging
commands, refer to the TOPS-10 DDT Manual.



6.2.1 Starting the Monitor 

When the monitor is loaded into core, data storage mapping and devices 
have not been configured. However, most of the useful information on 
the status of the monitor is contained in the monitor's high segment. 

The monitor will be mapped after you start it, but normally the 
monitor's symbol table, EDDT, and the SYSINI locations are cleared 
after initialization. You can preserve the symbol table, EDDT, and 
SYSINI initialization code by starting the monitor at location DEBUG, 
using the following command to EDDT: 

DEBUG$G 

On a normal startup, the monitor discards its symbol table, EDDT, and 
SYSINI initialization code. The address space is reclaimed for the 
monitor's Section 0 free core pool. However, when you use EDDT to
load the monitor (using the DEBUG$G command), this address space is
preserved, and the symbol table is moved into Section 35 (KL10) or out
of the monitor's address space into unmapped core (KS10). A pointer
to the physical address of the symbol table is stored in the Exec Data 
Vector for use by EDDT. 



6.2.2 Breakpoints 

You can insert breakpoints anytime after the EDDT prompt. Unless you 
are debugging system initialization code, it is useful to set an 
initial breakpoint at the label "HIGHIN". When this point in the code
has been reached, the monitor is ready to run. That is, all other
CPUs have been started, channels can be autoconfigured, and so forth.

After the monitor starts running, you can type <CTRL/D> on any CTY to
enter EDDT on the current CPU. SCNSER intercepts the <CTRL/D> 
character at interrupt level, saves the contents of the current AC 
block, and executes an unsolicited breakpoint entry into EDDT. Then 
you can type any valid EDDT command on the CTY. You can resume 
monitor execution by typing $P. SCNSER will ignore the <CTRL/D>
character that caused control to pass to EDDT. The <CTRL/D> facility 
is controlled under timesharing by the use of the following monitor 
command on the CTY: 

.SET EDDT BREAKPOINT [OFF/ON] 

The default setting for this command is ON when Bit 0 is set in the
DEBUGF word.



6.3 DEBUGF FLAGS

The DEBUGF word contains the following flags, which can be set and 
cleared using OPR commands. The most useful flag for the systems
analyst is Bit 0, the sign bit. This flag indicates that EDDT is 
loaded for debugging the monitor and enables breakpointing monitor 
code. 



Bit   Name     Description

0     DF.SBD   System being debugged (EDDT loaded).

1     DF.RDC   Reload on DEBUG stopcodes.

2     DF.RJE   Reload on JOB stopcodes.

3     DF.NAR   Do not automatically reload.

4     DF.CP1   Stop entire system on any CPU stopcode.

5     DF.DDC   Do not output a memory dump on a DEBUG stopcode.

6     DF.DJE   Do not output a memory dump on a JOB stopcode.

7     DF.DCP   Do not output a memory dump on a CPU stopcode.

8     DF.RQC   Start CRSCPY program to copy the previous crash
               file at the time of the next clock tick on the
               policy CPU.

9     DF.RQK   Call KDPLDR on the next clock tick.

10    DF.RQN   Call KNILDR on the next clock tick (obsolete).

11    DF.WFL   Copy output to FRCLIN at system CTY.

12    DF.DDC   Disable next CRSCPY request.

13    DF.RIP   Reload in progress (RECON. function .RCRLD).

14    DF.RAD   Reload after dump (don't dump twice in BOOT).

15    DF.RLD   Stopcode caused by a reload (used by CRSCPY).

18    DF.BP0   Can enter EDDT on CPU0 using XCT .C0DDT.

19    DF.BP1   Can enter EDDT on CPU1 using XCT .C1DDT.

20    DF.BP2   Can enter EDDT on CPU2 using XCT .C2DDT.

21    DF.BP3   Can enter EDDT on CPU3 using XCT .C3DDT.

22    DF.BP4   Can enter EDDT on CPU4 using XCT .C4DDT.

23    DF.BP5   Can enter EDDT on CPU5 using XCT .C5DDT.



For example, suppose you want to stop the system before reloading to 
reconfigure the hardware. To do this, Bit 3 in the DEBUGF word should
be set. To disable automatic reloads, run the OPR program and type 
the following commands to CONFIG: 

.R OPR<RET> 
OPR>ENTER CONFIG<RET> 
CONFIG>SET NO AUTO-RELOAD<RET> 
CONFIG>EXIT<RET> 
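
When you read the DEBUGF word out of a crash file, a small decoder saves counting bits by hand. The Python sketch below is an illustration: the dictionary and function names are hypothetical, the bit numbers and flag names are read from the table in this section (including the Bit 12 name exactly as printed there), and bits are numbered from 0 at the sign end of the 36-bit word.

```python
WORD_BITS = 36  # PDP-10 word; Bit 0 is the sign (leftmost) bit

# Bit number -> flag name, as read from the DEBUGF table in this section.
DEBUGF_FLAGS = {
    0: "DF.SBD", 1: "DF.RDC", 2: "DF.RJE", 3: "DF.NAR",
    4: "DF.CP1", 5: "DF.DDC", 6: "DF.DJE", 7: "DF.DCP",
    8: "DF.RQC", 9: "DF.RQK", 10: "DF.RQN", 11: "DF.WFL",
    12: "DF.DDC",  # name as printed in the table for Bit 12
    13: "DF.RIP", 14: "DF.RAD", 15: "DF.RLD",
    18: "DF.BP0", 19: "DF.BP1", 20: "DF.BP2",
    21: "DF.BP3", 22: "DF.BP4", 23: "DF.BP5",
}


def decode_debugf(word):
    """Return [(bit, name), ...] for each listed flag set in the word."""
    return [
        (bit, name)
        for bit, name in sorted(DEBUGF_FLAGS.items())
        if word & (1 << (WORD_BITS - 1 - bit))
    ]
```

A word with only the sign bit set (400000000000 octal) decodes to the single entry for DF.SBD.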



6.4 MULTI-CPU ENVIRONMENT

Debugging a multiple-CPU system requires special considerations. EDDT 
performs all terminal I/O for the CTY that encountered the breakpoint. 
It is not unusual to use all CTYs on the system during a debugging 
session. 

When a CPU stops at a breakpoint, normally the other CPU(s) will 
continue to run. If the breakpoint occurred on a non-policy CPU, the 
CTY on the policy CPU will report the following message: 

problem on CPUn . . . 

However, if the breakpoint occurs on the policy CPU, a role switch 
occurs and another CPU assumes the role of the policy CPU. Although 
this behavior is desirable during timesharing, the role switch makes 
it very difficult to debug a multiple-CPU monitor when more than one 
CPU is running. Also, when the CPUs in the system detect the fact 
that one of the CPUs is not running, interlocks owned by the halted 
CPU are broken. If the CPU was actually paused at a breakpoint, and 
then continued, CIB stopcodes can occur. 

To prevent role switching, a flag (DEBCPU) is set, and contains the 
CPU number on which you typed DEBUG$G. DEBCPU is checked in the 
BRKLOK and BECOM0 routines, to prevent possible role switches. This
may be circumvented by patching a JFCL at DDTCPU prior to typing 
DEBUG$G. 

Monitor messages are sent once per hour on the CTY. The following 
patch will circumvent this BIGBEN routine: 

BIGBEN/POPJ P, 



6.5 CAUTIONS

Remember, EDDT provides little protection against user errors. Keep 
the following points in mind when you are debugging a running monitor: 

o EDDT cannot execute a UUO when you issue the $X and $$X 
commands. This is a restriction. Attempts to do this on a 
KL usually result in a PI Level Interrupt Error from 
RSX-20F. The monitor performs some UUOs internally, in the 
SAVE/GET code, and the CLOSE and FINISH commands. 

o You can change the AC block for EDDT when the monitor is at a 
breakpoint and you wish to deposit data into an AC block 
other than the current one. Use the following command to
change to the AC block you specify (n):

n$4U 

Do not attempt to use AC Blocks 6 or 7 on a KL10. This will
crash the system because the microcode uses portions of AC
Block 6 and all of AC Block 7.

o On a multiple-CPU system, there are locations in ONCMOD and 
SYSINI where the CPU must wait for another CPU to finish an
operation. If that other CPU is halted at a breakpoint, the
waiting CPU will time out. You must devise specific patches
at CPUXCT to prevent this situation. 



APPENDIX A
GLOSSARY 



The table below provides an alphabetized list of the abbreviations and 
acronyms used in this manual, with expanded names to define them. 



Table A-1: Glossary of Acronyms

Acronym   Meaning

AC        Accumulator
APR       Arithmetic Processor
BR        Buffer Register
CDB       Central Processing Unit Data Block
CFP       Compressed File Pointer
CHN       Channel Data Block
CI        Computer Interconnect
CPU       Central Processing Unit
CRAM      Control Random-Access Memory
CTY       Console Terminal
CX        A job context
DDB       Device Data Block
DDT       DEC Debugging Tool
DRAM      Dispatch Random-Access Memory
EBR       Exec Base Register
EPT       Exec Process Table
EVM       Exec Virtual Memory
FM        Fast Memory
I/O       Input/Output
IORB      Input/Output Request Block
IPCF      Interprocess Communication Facility
IR        Instruction Register
JDA       Job Device Assignment table
KDB       Controller Data Block
KON       Disk Controller Data Block
LDB       Line Data Block
MB        Memory Buffer
MFD       Master File Directory
MQ        Multiplier/Quotient Register
MUUO      Monitor UUO (see UUO)
NI        Network Interconnect
NZS       Non-Zero Section
PC        Program Counter
PDB       Process Data Block
PI        Priority Interrupt
PMA       Physical Memory Address
PPB       PPN Data Block
PPN       Project-Programmer Number
PTY       Pseudo-Terminal
PWQ       Position Wait Queue
RAM       Random-Access Memory
RIB       Retrieval Information Block
SAT       Storage Allocation Table
SCA       Systems Communications Architecture
SCS       Systems Communications Services
SFD       Sub-File Directory
SMP       Symmetric Multiprocessing
SPR       Software Performance Report
SPT       Special Pages Table (for mapping)
          Storage Allocation Pointer Table (for disk I/O)
STR       Structure Data Block
TKB       Tape Controller Data Block
TTY       Terminal
TUB       Tape Unit Data Block
TWQ       Transfer Wait Queue
UBR       User Base Register
UDB       Unit Data Block
UFD       User File Directory
UNI       Disk Unit Data Block
UPT       User Process Table
UUO       Unimplemented User Operation (monitor call)
VMA       Virtual Memory Address



APPENDIX B
ADDRESS SPACE LAYOUT 

Monitor Code Section Layout 

NOTE 

The specifications shown in the following figures are 
subject to change without notice. Addresses are shown 
for comparison purposes only; actual addresses may be 
different depending on your specific monitor 
configuration.
Address range              Contents

00,,000000 - 00,,073777    Traditional "Low Seg": COMxxx data
                           structures, Exec page maps, interrupt
                           vectors & code, prototype DDBs, Job
                           (JBT) tables

00,,074000 - 00,,245777    PTY DDBs, TTY DDBs, Monitor free core,
                           KDBs, UDBs, PDBs, Context blocks, etc.

00,,246000 - 00,,327777    Void

00,,330000 - 00,,334777    Common Subroutines

00,,335000 - 00,,337777    Void

00,,340000 - 00,,726777    Traditional "High Seg": pure code, UUO
                           calls, device drivers, IPCF, ENQ/DEQ,
                           ANF, etc.

00,,727000 - 00,,733777    Void

00,,734000 - 00,,735777    Per-CPU CDB mapping

00,,736000 - 00,,737777    Void

00,,740000 - 00,,777777    Job per-process mapping: UPT,
                           Extended-exec-PDL, Disk DDBs, TMPCOR,
                           pathological names, .TEMP, .JBPK, etc.
                           map slots

01,,000000 - 01,,777777    Monitor Section One
                           (mapped identically to Section Zero)

Figure B-1: Monitor Code Section Layout
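
When you examine a crash, it can help to know which region of section 0 a given address falls in. The following Python sketch is one way to do that lookup. It is illustrative only: the table is transcribed from Figure B-1, the names SECTION0_REGIONS and region_for are hypothetical, and the Common Subroutines region is assumed to start at 330000, the value Figure B-2 shows for the same region. Actual boundaries depend on your monitor configuration.

```python
# Regions of monitor section 0, transcribed from Figure B-1 (octal offsets).
# Assumption: these are the example values from the figure, not the
# boundaries of any particular running monitor.
SECTION0_REGIONS = [
    (0o000000, 0o073777, 'Traditional "Low Seg"'),
    (0o074000, 0o245777, "DDBs, monitor free core, KDBs, UDBs, PDBs"),
    (0o246000, 0o327777, "Void"),
    (0o330000, 0o334777, "Common Subroutines"),
    (0o335000, 0o337777, "Void"),
    (0o340000, 0o726777, 'Traditional "High Seg"'),
    (0o727000, 0o733777, "Void"),
    (0o734000, 0o735777, "Per-CPU CDB mapping"),
    (0o736000, 0o737777, "Void"),
    (0o740000, 0o777777, "Job per-process mapping"),
]


def region_for(offset):
    """Return the Figure B-1 region name that contains an 18-bit offset."""
    if not 0 <= offset <= 0o777777:
        raise ValueError("offset must fit in 18 bits")
    for first, last, name in SECTION0_REGIONS:
        if first <= offset <= last:
            return name
    raise LookupError("offset not covered by the table")
```

For example, region_for(0o341234) reports the "High Seg" region, which suggests the address lies in pure monitor code rather than in a data structure.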
DECnet Code Section Layout

Address range              Contents

02,,000000 - 02,,073777    Traditional "Low Seg": COMxxx data
                           structures, Exec page maps, interrupt
                           vectors & code, prototype DDBs, Job
                           (JBT) tables

02,,074000 - 02,,245777    PTY DDBs, TTY DDBs, Monitor free core,
                           KDBs, UDBs, PDBs, Context blocks, etc.

02,,246000 - 02,,327777    Void

02,,330000 - 02,,334777    Common Subroutines

02,,335000 - 02,,627777    Void

02,,630000 - 02,,717777    "Sky Hi Seg" DECnet code

02,,720000 - 02,,733777    Void

02,,734000 - 02,,735777    Per-CPU CDB mapping

02,,736000 - 02,,737777    Void

02,,740000 - 02,,777777    Job per-process mapping: UPT,
                           Extended-exec-PDL, Disk DDBs, TMPCOR,
                           pathological names, .TEMP, .JBPK, etc.
                           map slots

Figure B-2: DECnet Code Section Layout
Monitor Data Section 3 Layout

Address range              Contents

03,,000000 - 03,,017777    PAGTAB

03,,020000 - 03,,037777    PT2TAB

03,,040000 - 03,,057777    MEMTAB

03,,060000 - 03,,174777    Disk Cache
                           "NZS" free core

03,,175000 - 03,,277777    Void

03,,300000 - 03,,407777    DECnet "MB" pool

03,,410000 - 03,,517777    DECnet free pool

03,,520000 - 03,,543777    DECnet name-to-address
                           translation table

03,,544000 - 03,,547777    KLNI free pool

03,,550000 - 03,,553777    LAT free pool

03,,554000 - 03,,777777    Void

Figure B-3: Monitor Data Section 3 Layout
Monitor Data Sections 4,5 Layout

Address range              Contents

04,,000000 - 04,,051777    SCNSER TTY LDBs & Chunks

04,,052000 - 04,,777777    Void

05,,000000 - 05,,004777    SCA Free pool

05,,005000 - 05,,121777    SCA Datagram buffers

05,,122000 - 05,,165777    SCA Message buffers

05,,166000 - 05,,166777    SCA Connect ID table

05,,170000 - 05,,171777    KLIPA BSDs

05,,172000 - 05,,172777    KLIPA BHDs

05,,173000 - 05,,176777    LAT "extra allocation"

05,,177000 - 05,,777777    Void

Figure B-4: Monitor Data Sections 4,5 Layout
Monitor Data Sections 6,7 Layout

Address range              Contents

06,,000000 - 06,,007500    BOOT

06,,007500 - 06,,012250    DX10 (DXMPA) ucode

06,,012250 - 06,,014650    DX20 (DXMCA) ucode

06,,014650 - 06,,017250    DX20 (DXMCD) ucode

06,,017250 - 06,,034530    KLIPA (KLPCOD) ucode

06,,034530 - 06,,051777    KLNI (KNICOD) ucode

06,,052000 - 06,,777777    Void

07,,000000 - 07,,003777    Swapping SATs

07,,004000 - 07,,076777    Disk SATs

07,,077000 - 07,,122777    SAT free core

07,,123000 - 07,,777777    Void

Figure B-5: Monitor Data Sections 6,7 Layout
Monitor Data Sections 35,36,37 Layout

Address range              Contents

35,,000000 - 35,,252777    Symbol table for EDDT while
                           debugging, otherwise void.

35,,253000 - 35,,777777    Void

36,,000000 - 36,,777777    SNOOPY Scratch space

37,,000000 - 37,,677000    Void

37,,700000 - 37,,737777    Exec section maps

37,,740000 - 37,,777777    User section maps

Figure B-6: Monitor Data Sections 35,36,37 Layout
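
The figures in this appendix write each address as an octal section number, two commas, and an octal offset within the section. The following Python sketch shows one way to convert between that notation and a single integer. It assumes the 30-bit extended address format, with the section number in the upper 12 bits and an 18-bit offset in the lower bits. The function names parse_address and format_address are hypothetical and are not part of any TOPS-10 tool.

```python
def parse_address(text):
    """Convert 'ss,,oooooo' (octal section,,offset) to a 30-bit integer."""
    section_text, separator, offset_text = text.partition(",,")
    if not separator:
        raise ValueError("expected 'section,,offset'")
    section = int(section_text, 8)
    offset = int(offset_text, 8)
    # Assumption: 12-bit section number and 18-bit in-section offset.
    if not 0 <= section <= 0o7777 or not 0 <= offset <= 0o777777:
        raise ValueError("section or offset out of range")
    return (section << 18) | offset


def format_address(address):
    """Format a 30-bit integer back into 'ss,,oooooo' notation."""
    return f"{address >> 18:02o},,{address & 0o777777:06o}"
```

With these helpers, two addresses from the figures can be compared as ordinary integers, for example to check whether a value found in a dump lies between the first and last address of a region.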
INDEX



-A- 



AC blocks 

finding, 3-13 

switching, 3-5, 4-6
Access 

codes, 3-2 

table (ACC), 4-27
Accumulators, 2-5 

locations, 3-13 

monitor, 3-6, 4-2 

saving, 3-9 

scheduler, 3-13 

traps, 3-12 

user, 4-8 
Addressing non-zero sections, 3-4 
Allocating disk space, 4-23 
Alternate page maps, 3-3 
ANF-10 networks, 4-12 
APR interrupts, 5-3 
APRSER module, 5-4 
AR/ARX parity errors, 5-5
Arithmetic Logic Unit (ALU), 5-6
Assigning channel numbers, 4-11 
Attached terminals, 4-17 
AU resource, 4-10 
AUTCON module, 3-14, 4-20, 4-26 
Automatic reloads, 2-2 
AVALTB table, 4-10 

-B-

30-bit addressing, 3-4 
Blocking 

programs, 3-5 

user jobs, 3-6 
BOOT, 2-1, 2-2 
Booting systems, 2-2 
Break characters, 4-16 
Breakpointing monitors, 6-2 
BUG. macro, 5-9 
Building monitors, 4-4 
Byte pointers, 4-4 

-C- 

Cacheable pages, 3-2 
Caching 

disk information, 4-28 

UPT locations, 3-6 
CALLI UUOs, 4-11 
CDB 

constants area, 4-9 

defining locations, 4-9 

variables area, 4-9 
Changing AC sets, 6-4 
Channels, 4-11 

data blocks (CHN), 4-19, 4-20, 4-27

error report, 5-4 

status bits, 4-11 
Checking parity, 5-7 
Chunks 

counts, 4-16 

terminal, 4-16 
Clearing virtual addressing, 2-5 
Clock, 3-12 
CLOCK1 module, 3-14
Clusters, 4-23 
CNFDVN location, 2-7 
COMDEV module, 4-5 
Command 

dispatch bits, 4-2 

files 

FILDDT, 2-8 
RSX-20F, 5-13 

tables, 4-11 
COMMOD module, 4-5 
COMMON module, 3-14, 4-5, 4-9, 4-17, 5-3
Common modules, 4-5 
Compressed File Pointer (CFP), 4-23
COMTAB table, 4-11 
Concealed mode, 3-4, 3-5 
Conditionals, 4-6 
Connecting devices, 5-7 
CONSO skip chain, 3-9, 3-14 
Console 

front-ends, 5-13 

terminal, 1-1 
Continuable stopcodes, 1-2, 5-11 
Control RAM (CRAM), 5-6
Controller data block (KON), 4-27
Controlling terminal, 4-17 
Copying crash files, 2-2 
CPNSER module, 3-15 
CPU 

Data Blocks (CDBs), 2-5, 4-9, 5-5

interlocks, 6-4 

stopcodes, 5-10 
Crash 

analysis, 1-1 

files, 1-1, 2-1 

space, 2-1 
Crash files, 1-1, 1-4, 2-1, 2-2 
CRASH.EXE file, 2-1 
Creating 

crash files, 2-1 

FILDDT command files, 2-8 

symbolic FILDDT, 2-3 
CREF 

listings, 4-6 

program, 4-5 
CRSCPY program, 2-1, 2-2 
CTXSER module, 3-15 
CTY, 1-1

Current ACs, 2-7

Cursor position counter, 4-16 

CX resource, 4-10 

CYCLE error, 5-8 

Cycles, 3-12 

-D- 

D36PAR module, 3-15 

DA resource, 4-10 

DDBs, 4-11

DEBUG stopcodes, 5-10 

DEBUGF word, 5-11, 6-3 

Debugging the monitor, 6-1 

DECnet 

front-ends, 4-12

layout, 3-3 
Defining 

CDB locations, 4-9 

symbols, 3-15 
Device 

codes, 3-5 

Data Blocks (DDBs), 4-2, 4-12

information, 4-12 

interrupts, 3-8 

status word, 4-2 
Devices 

RDA, 4-12 
DEVIOS word, 4-2 
DIE routine, 5-9, 5-10 
DIECDB location, 2-5 
Directories, 4-21 
Disabling 

extended addressing, 2-5 

time messages, 6-4 

user addressing, 2-6 
Disk 

cache, 4-28 

controller data block (KON), 4-2

device data blocks, 4-28 

dual-ported devices, 4-6 

file structure, 4-21 

I/O, 4-21 

on-line information, 4-26

storage allocation, 4-23 
Dismissing interrupts, 3-9 
DISP table, 4-11 
Dispatch RAM (DRAM), 5-6
DN20 front-ends, 4-12
Doubleword PC, 3-4 
DTE 

DDBs, 4-12

interrupts, 3-10
DTEFRM module, 3-15, 4-7
DTESER module, 4-12
Dual-ported disks, 4-6

-E-

Echo count, 4-16
EDDT, 6-2
Enabling addressing, 2-5
ENQ/DEQ

module, 3-15
ERnPDL stack, 3-13, 5-4
ERRCON module, 5-10
Error

handling, 5-1

hardware codes, 3-11

parity, 5-8

processing routines, 3-13
ETHPRM module, 3-15
EUE stopcodes, 5-13
EV resource, 4-10
EVENT stopcodes, 5-10
Exec

Base Register (EBR), 3-2

kernel mode, 3-4

mode, 3-2, 3-4, 3-5

Process Table (EPT), 2-4, 2-5, 3-2
Exec-mode DDT, 6-2
EXECAC macro, 4-6
Execute-only programs, 3-5
Executing command files, 2-8
Execution Box (EBOX), 5-6
Executive UUO Error (EUE), 5-13
Exiting FILDDT, 2-4
Extended

addressing, 2-5, 3-3

channel table, 4-11

software channels, 3-3

-F-

F module, 3-15, 4-6
FAKEAC flag, 2-5
Fast Memory (FM), 5-6
Fatal errors, 1-1, 1-2
Fault continuation, 1-2
Feature test options, 4-6
FILDDT

command files, 2-8

mapping commands, 2-5

program, 2-3
Finding

AC blocks, 3-13

DDBs, 4-11

stopcodes, 5-9

symbolic definitions, 4-6
Flag-PC doubleword, 3-4
Flags for DEBUGF, 6-3
Forced commands, 4-11
Forced system dumps, 2-1
Forcing reloads, 2-1
Free core, 4-10
Front-ends, 4-12
Full clock cycle, 3-12
Funny space, 3-3

-G-

Generating parity, 5-7
GLOB program, 4-7
Global

section references, 3-4 

symbols, 4-1, 4-7 
Groups of disk data, 4-22 

-H- 

HALT stopcodes, 5-10 
Halting systems, 2-2 
Handling 

errors, 5-1 

interrupts, 3-9 
Hardware 

addressing, 2-5 

error codes, 3-11 

errors, 5-1 

interrupts, 3-14 

mapping, 3-2 
HOME blocks, 4-22 

-I- 

I/O 

channels, 4-11 

Request Block (IORB), 4-19

status word, 4-2 

tables, 4-11 
IF statement, 4-6 
IME stopcodes, 5-12 
INFO stopcodes, 5-10 
Inserting breakpoints, 6-2 
Instruction Register (IR), 5-6
Interlocks between CPUs, 6-4 
Interrupt, 3-6 

accumulators, 3-9 

error-handling, 3-8 

handling routine, 3-8 

levels, 3-6 

PDLs, 3-9 

processor, 5-3 

stacks, 3-9 

Vector (IVIR), 3-10
Interrupting 

on Level 0, 3-8 

on Level 7, 3-12 
Intertask communication, 4-12 
INTTAB table, 4-10 
Invalid mapping, 2-7 
IPCSER module, 3-15 
IVIR register, 3-10 

-J- 

JBT tables, 4-7 
JBTPPB table, 4-27 
Job 

context module, 3-15 

Device Assignment table (JDA), 4-11

stopcodes, 5-10 

tables, 4-7 
Job-specific monitor locations, 3-3



JOBDAT 

area, 3-6 
locations, 4-8 
module, 3-15, 4-7 
vestigial, 3-3 

-K- 

Keep Me bit, 3-2 

Keep-Alive Fail (KAF), 5-12, 5-13

Kernel mode, 3-4 

KL interrupt handling, 3-9 

KL-paging, 2-5, 3-3 

KLPPRM module, 3-15 

KNO word, 4-28 

KS 

alternate page maps, 3-3 
interrupt handling, 3-9 
reloading systems, 2-2 

-L- 

Label DDBs, 4-20
Line 

characteristics bits, 4-16 

Data Blocks (LDBs), 4-2, 4-15
LINTAB table, 4-16 
Loading FILDDT symbols, 2-3 
Local symbols, 4-1 

unlocking, 4-7 
Locating EPTs, 2-5 
Locations 

0-17, 2-5 

30, 2-1 

406, 2-2 

407, 2-2 
500, 3-11 
DIECDB, 2-5 

LOKCON module, 3-15 

Low segment addresses, 2-5 



-M- 



Macros, 4-5 

MACSYM module, 3-15, 4-7 

Magnetic tape devices, 4-19 

Mapping 

ACs, 2-6

dumps, 2-5 

exec virtual memory, 2-6 

extended sections, 2-5 

user jobs, 2-6 

verification, 2-7 

virtual addresses, 2-4, 2-5, 3-2
Master File Directory (MFD), 4-21
MCA25 bit, 3-2 
MCB software, 4-12 
Memory 

Box (MBOX), 5-6

dump, 1-1

tables, 4-10 
MEMTAB table, 4-10 
MIC information, 4-16
MM resource, 4-10
Mode flag, 3-4 
Modules, 3-13 

common, 4-5 

monitor startup, 3-14 

optional, 3-15 

symbol definition, 3-15 
MONGEN program, 4-4 
Monitor 

ACs, 4-2

breakpointing, 6-2

building, 4-4 

command processing, 4-11 

functions, 3-1 

macros, 4-5 

modules, 3-13 

name, 2-8 

sources, 4-5 

startup modules, 3-14 

symbols, 4-1 

version numbers, 2-7 
Monitor-resident user data, 3-3 
Monitor-specific FILDDT, 2-3 
MSCPAR module, 3-15 
Multiple-KL systems, 4-6 
MUUO, 3-6 



-N- 



Name Block (NMB), 4-27

Nested SFDs, 4-21 

NETDEV module, 4-12 
SCPAR module, 3-15
Sections, 3-3 

DECnet, 3-3 

mapping, 2-5 

pointers, 3-2 

references, 3-4 

tables, 3-2 
SEILM routine, 3-12, 5-4 
Servicing interrupts, 3-9 
SET commands, 4-11 